"Clostridium difficile vaccine"

Introduction 

The invention relates to vaccines to provide immunological protection against C.
difficile infection.

Background 

Clostridium difficile is a common nosocomial pathogen and a major cause of
morbidity and mortality among hospitalised patients throughout the world [Kelly et
al., 1994]. Outbreaks of C. difficile have necessitated ward and partial hospital
closure. With the increasing elderly population and the changing demographics of
the population, C. difficile is set to become a major problem in the 21st century. The
spectrum of C. difficile diseases ranges from asymptomatic carriage to mild diarrhoea
to fulminant pseudomembranous colitis. Host factors rather than bacterial factors
appear to determine the response to C. difficile [Cheng et al., 1997; McFarland et al.,
1991; Shim et al., 1998].

Reports indicate that hypogammaglobulinaemia in children appears to predispose to
the development of disease due to C. difficile and that therapy with intravenously
administered gamma globulin can be associated with the clinical resolution of
chronic relapsing colitis due to C. difficile disease [Leung et al., 1991; Pelmutter et
al., 1985]. A study by Mulligan et al. [1993] found elevated levels of
immunoglobulins reactive with C. difficile in asymptomatic carriers as opposed to
symptomatic patients. Recently it has been shown that patients who became
colonised with C. difficile who had relatively low levels of serum IgG antibody
against toxin A had a much greater risk of developing C. difficile diarrhoea [Kyne et
al., 2000].



It is clear that any advance in the understanding of C. difficile disease and methods
of preventing or treating C. difficile diarrhoea (CDD) and other related diseases will
be of major therapeutic potential.

Statements of Invention



According to the invention there is provided a vaccine for the treatment or
prophylaxis of C. difficile associated disease, the vaccine comprising a C. difficile
gene or a C. difficile peptide/polypeptide or a derivative or fragment or mutant or
variant thereof which is immunogenic in humans.

The invention also provides a vaccine for the treatment or prophylaxis of C. difficile
associated disease, the vaccine comprising a C. difficile gene or C. difficile
peptide/polypeptide or a derivative or fragment or mutant or variant thereof to which
immunoreactivity is detected in individuals who have recovered from C. difficile
infection.




Preferably the gene encodes a C. difficile surface layer protein, SlpA or variant or
homologue thereof.



Preferably the peptide/polypeptide is a C. difficile surface layer protein, SlpA or
variant or homologue thereof.

Most preferably the vaccine comprises a chimeric nucleic acid sequence. Preferably
the chimeric nucleic acid sequence is derived from the 5' end of the gene, encoding
the mature N-terminal moiety of SlpA from C. difficile.

In one embodiment of the invention the vaccine comprises a chimeric
peptide/polypeptide. Preferably the amino acid sequence of the chimeric
peptide/polypeptide is derived from the mature N-terminal moiety of SlpA from C.
difficile.

Preferably the vaccine of the invention contains an amino acid sequence SEQ ID
No. 1 or a derivative or fragment or mutant or variant thereof.

Preferably the vaccine contains an amino acid sequence SEQ ID No. 2 or a derivative
or fragment or mutant or variant thereof.

In one embodiment of the invention the vaccine contains a nucleotide sequence SEQ
ID No. 3 or a derivative or fragment or mutant or variant thereof; a nucleotide
sequence SEQ ID No. 4 or a derivative or fragment or mutant or variant thereof; a
nucleotide sequence SEQ ID No. 5 or a derivative or fragment or mutant or variant
thereof; a nucleotide sequence SEQ ID No. 6 or a derivative or fragment or mutant or
variant thereof; a nucleotide sequence SEQ ID No. 7 or a derivative or fragment or
mutant or variant thereof; a nucleotide sequence SEQ ID No. 8 or a derivative or
fragment or mutant or variant thereof; a nucleotide sequence SEQ ID No. 9 or a
derivative or fragment or mutant or variant thereof; or a nucleotide sequence SEQ ID
No. 10 or a derivative or fragment or mutant or variant thereof.

Preferably the vaccine of the invention is in combination with at least one other C.
difficile sub-unit.

The invention provides a vaccine for the treatment or prophylaxis of C. difficile
associated disease, the vaccine comprising the mature N-terminal moiety of a surface
layer protein, SlpA of C. difficile or variant or homologue thereof which is
immunogenic in humans.

Most preferably the N-terminal moiety of SlpA contains an amino acid sequence 
SEQ ID No. 1. 




In one embodiment of the invention the N-terminal moiety of SlpA contains an 
amino acid sequence SEQ ID No. 2. 

The invention also provides a vaccine for the treatment or prophylaxis of C. difficile
associated disease, the vaccine comprising an immunodominant epitope derived
from a C. difficile gene or a C. difficile peptide/polypeptide or a derivative or
fragment or mutant or variant thereof which is immunogenic in humans.

Preferably the vaccine of the invention comprises a pharmaceutically acceptable 
carrier. Most preferably the vaccine is in combination with a pharmacologically 
suitable adjuvant. Ideally the adjuvant is interleukin 12. Alternatively the adjuvant
may be a heat shock protein.

In one embodiment of the invention the vaccine comprises at least one other 
pharmaceutical product. 

The pharmaceutical product may be an antibiotic, selected from one or more of
metronidazole, amoxycillin, tetracycline, erythromycin, clarithromycin or
tinidazole.

In one embodiment of the invention the pharmaceutical product comprises an acid- 
suppressing agent such as omeprazole or bismuth salts. 

The vaccine of the invention may be in a form for oral administration, intranasal 
administration, intravenous administration or intramuscular administration. 

In one embodiment of the invention the vaccine includes a peptide delivery system. 

The invention also provides an immunodominant epitope derived from a C. difficile
gene or a C. difficile peptide/polypeptide or a derivative or fragment or mutant or
variant thereof. Preferably the C. difficile peptide/polypeptide contains an amino
acid sequence SEQ ID No. 1 or SEQ ID No. 2 or a derivative or fragment or mutant or
variant thereof.

In one embodiment of the invention the C. difficile peptide/polypeptide contains an
amino acid sequence SEQ ID No. 3 or SEQ ID No. 4 or SEQ ID No. 5 or SEQ ID
No. 6 or SEQ ID No. 7 or SEQ ID No. 8 or SEQ ID No. 9 or SEQ ID No. 10 or a
derivative or fragment or mutant or variant thereof.

The invention further provides a chimeric nucleic acid sequence derived from the 5' 
end of the slpA gene encoding the mature N-terminal moiety of SlpA from C. 
difficile which is immunogenic in humans. 

The invention also provides a chimeric peptide/polypeptide wherein the amino acid 
sequence of the chimeric peptide/polypeptide is derived from the mature N-terminal 
moiety of SlpA from C. difficile.

The invention provides a C. difficile peptide comprising SEQ ID No. 1 or SEQ ID
No. 2 or SEQ ID No. 3 or SEQ ID No. 4 or SEQ ID No. 5 or SEQ ID No. 6 or SEQ
ID No. 7 or SEQ ID No. 8 or SEQ ID No. 9 or SEQ ID No. 10.

One aspect of the invention provides for the use of a C. difficile gene or a C. difficile
peptide/polypeptide or a derivative or fragment or mutant or variant thereof which is
immunogenic in humans in the preparation of a medicament for use in a method for
the treatment or prophylaxis of C. difficile infection or C. difficile associated disease
in a host.

Preferably the medicament which is prepared is a vaccine of the invention. 

The invention also provides a method for preparing a vaccine for prophylaxis or
treatment of C. difficile associated disease, the method comprising:

obtaining a C. difficile gene or a C. difficile peptide/polypeptide or a
derivative or fragment or mutant or variant thereof which is immunogenic in
humans; and

forming a vaccine preparation comprised of said gene or peptide/polypeptide
or derivative or fragment or mutant or variant, which is suitable for
administration to a host and which when administered raises an immune
response.

Preferably the C. difficile peptide/polypeptide contains an amino acid sequence SEQ
ID No. 1 or SEQ ID No. 2 or a derivative or fragment or mutant or variant thereof.

Most preferably the C. difficile gene contains an amino acid sequence SEQ ID No. 3
or SEQ ID No. 4 or SEQ ID No. 5 or SEQ ID No. 6 or SEQ ID No. 7 or SEQ ID No. 8
or SEQ ID No. 9 or SEQ ID No. 10 or a derivative or fragment or mutant or variant
thereof.

The invention further provides a method for prophylaxis or treatment of C. difficile
associated disease, the method comprising:

obtaining a C. difficile gene or a C. difficile peptide/polypeptide or a 
derivative or fragment or mutant or variant thereof which is immunogenic in 
humans; 

forming a vaccine preparation comprised of said gene or peptide/polypeptide 
or derivative or fragment or mutant or variant, and 

administering the vaccine preparation to a host to raise an immune response. 

One aspect of the invention provides monoclonal or polyclonal antibodies or 
fragments thereof, to a C. difficile peptide/polypeptide or a derivative or fragment or 
mutant or variant thereof which is immunogenic in humans. 

Another aspect of the invention provides monoclonal or polyclonal antibodies or 
fragments thereof, to C. difficile peptide/polypeptide or a derivative or fragment or 
mutant or variant thereof to which immunoreactivity is detected in individuals who 
have recovered from C. difficile infection.

The invention also provides purified antibodies or serum obtained by immunisation
of an animal with a vaccine of the invention.

The invention provides the use of the antibodies or fragments of the invention in the 
preparation of a medicament for treatment or prophylaxis of C. difficile infection or 
C. difficile associated disease.

Preferably the antibodies or serum are used in the preparation of a medicament for 
treatment or prophylaxis of C. difficile infection or C. difficile associated disease.

Most preferably the antibodies or fragments or serum of the invention are used in 
passive immunotherapy for established C. difficile infection. 

In one embodiment of the invention the antibodies or fragment or serum of the 
invention are used for the eradication of C. difficile associated disease. 

The invention also provides use of interleukin 12 as an adjuvant in a C. difficile
vaccine.

The invention further provides use of humanised antibodies or serum for passive 
vaccination of an individual with C. difficile infection. 

Brief Description of the Drawings 

The invention will be more clearly understood from the following description thereof 
given by way of example only with reference to the accompanying figures, in 
which:- 

Fig. 1A is a Western blot showing recognition of antigens from a crude
extract of C. difficile 171500 (PCR type 1) by serum antibodies from a
patient infected with this strain. Lane 1: Pre-infection; Lane 2: Early acute;
Lane 3: Late acute; Lane 4: Convalescent;

Fig. 1B is a Western blot showing recognition of antigens from a crude
extract of C. difficile 170324 (PCR type 12) by serum antibodies from a
patient infected with this strain. Lane 1: Pre-infection; Lanes 2-5: Acute;
Lanes 6-7: Convalescent;

Fig. 2 is a Western blot showing recognition of antigens from two C. difficile
strains of different type by serum from convalescent patients.

Lane 1: Strain 170324 (PCR type 12), crude antigen preparation
Lane 2: Strain 170324, surface layer protein preparation
Lane 3: Strain 171500 (PCR type 1), crude antigen preparation
Lane 4: Strain 171500, surface layer protein preparation.
Molecular mass markers (kDa) are shown on the left; and

Fig. 3 is an SDS-PAGE gel showing crude SLP preparations from selected
strains of C. difficile. The gel contains 12% acrylamide, and has been stained
for protein with Coomassie Blue. Each lane contains 5 µg of protein.
Molecular weight markers are shown on the left.

Lane 1: 171500 (PCR type 1)
Lane 2: 172450 (PCR type 5)
Lane 3: 170324 (PCR type 12)
Lane 4: 171448 (PCR type 12)
Lane 5: 171862 (PCR type 17)
Lane 6: 173644 (PCR type 31)
Lane 7: 170444 (PCR type 46)
Lane 8: 170426 (PCR type 92)




Detailed Description of the invention 

Two antigenic peptides containing SEQ ID No. 1 and SEQ ID No. 2, associated with
two common infecting types of C. difficile, were found to be immunogenic in
humans. The antigenic peptides were found to induce a strong immune response in
individuals who recover from C. difficile infection. Individuals who have recovered
from C. difficile infection are those individuals who have been exposed to C. difficile
or something strongly related and have recovered. This includes individuals where a
carrier state exists in that the C. difficile infection has not and will not necessarily
become clinically significant.

These antigenic peptides were found to be products of the slpA gene from C. difficile
which is the structural gene for the surface layer protein, SlpA. The gene or its
products are therefore ideal candidates for the preparation of vaccines against C.
difficile.

Surface layer proteins (SLPs), also known as S-layers or crystalline surface layers,
are associated with a wide range of bacterial species. They form a 2-dimensional
array, which covers the surface of the cell completely, and grows with the cell
[Sleytr et al., 1993]. The molecular weight can range from 40 000 to 200 000 Da.
The proteins are typically acidic, contain a large proportion of hydrophobic amino
acid residues, and have few or no sulphur-containing amino acid residues.
Glycosylated S-layer proteins occur in some species. The precise function of S-layers
is not always known, but since they comprise approximately 15% of the cell
protein, it seems likely that they are important for in vivo functioning of the
organism. In Gram positive organisms, the SLP has been shown to delay or prevent
the excretion of degradative enzymes from the cell to the outside milieu, and may
thereby create a space analogous to the periplasmic space of Gram negative bacteria.
Many pathogenic species possess SLPs, which have been ascribed functions such as
antiphagocytosis (Campylobacter fetus), and inhibition of complement-mediated
killing (Aeromonas salmonicida).

Kawata et al. [1984] described the SLPs of Clostridium difficile. They showed the
S-layer to be composed of 2 polypeptides, and demonstrated size heterogeneity for
the polypeptides from different strains. Delmee et al. [1986] showed that crude
extracts from C. difficile strains of different serotype showed different polypeptide
profiles in SDS-PAGE. Poxton et al. [1999] made similar observations using
purified SLP preparations. Slide agglutination [Delmee et al., 1990] has identified 21
different serotypes, apparently distinguished by the heterogeneity of the SLP.

Pantosti et al. [1989] isolated C. difficile from a number of patients with antibiotic-
associated diarrhoea, and prepared SLPs from them. Cerquetti et al. [2000]
published N-terminal sequences of SLPs from several strains, indicating wide
differences between strains. In 2000 the complete DNA sequence of the C. difficile
genome was published (available at web address
http://www.sanger.ac.uk/Projects/C_difficile/).

The peptides of the invention were found to be encoded by a single open reading 
frame (ORF) named slpA from C. difficile. The peptides identified in our clinical 
study correspond to a lower molecular weight moiety of the slpA gene product. 
Since an immune response is also mounted against a higher molecular weight slpA 
gene product (Fig. 2), this entity may also be included in a vaccine. 

The slpA gene has been sequenced from a number of strains corresponding to
different PCR types. The sequences of strains 171500 (PCR type 1) (NCIMB 41081;
PHLS R13537), 172450 (PCR type 5) (PHLS R12884), 170324 (PCR type 12)
(NCIMB 41080; PHLS R12882), 171448 (PCR type 12) (PHLS R13550), 171862
(PCR type 17) (PHLS R13702), 173644 (PCR type 31) (PHLS R13711), 170444
(PCR type 46) (PHLS R12883) and 170426 (PCR type 92) (PHLS R12871) with
translations thereof are given in Appendices 1 to 8. Substantial variation in
nucleotide and predicted amino acid sequence was found between strains of PCR
types 1, 5, 12, 17 and 31. The genes from strains of PCR types 46 and 92 are almost
identical in sequence to those of PCR type 12. When the DNA sequences of genes
of different strains within a PCR type are compared, the sequences are almost if not
quite identical, indicating that the potential for variation is not infinite. These
findings are in agreement with serotyping studies [Delmee et al., 1986, 1990], and
indicate that the production of an effective vaccine based on the slpA product is
feasible. In this respect, the present invention includes all variant slpA genes and
their products, individually and combined, fragments of them, and their mutants and
derivatives.

One aspect of the invention provides the combination of immunodominant epitopes
from the slpA gene products from various serotypes into a single vaccine. In this
way a single vaccine may be used to immunise against several different C. difficile
strains.

The most common PCR types isolated from infections in the clinical study carried
out at St. James's Hospital, Dublin, Ireland were PCR types 1 and 12. However, a
vaccine which elicits an intense antibody response against many infecting types
would be therapeutically very valuable. A recombinant DNA chimera, or several
chimeras, encoding contiguous immunodominant epitopes may be made for use in
the vaccine. The recombinant DNA may serve as the active component in a vaccine,
or may be inserted into an appropriate expression system for the generation of a
chimeric peptide vaccine in a suitable host.

Chimeras can be generated by PCR amplification of the DNA encoding peptide
regions of interest, incorporating cleavage sites for restriction endonucleases into the
primers. The amplified fragments can thus be cleaved to generate compatible ends,
and spliced together to create chimeras.
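
The splicing step can be sketched in silico. The following minimal Python example is an illustration only and assumes details that the description does not specify: the BamHI recognition sequence (GGATCC, cut after the first G) is used as the cleavage site, the two epitope-coding inserts are short hypothetical sequences, and only the top strand is modelled.

```python
# In-silico sketch of joining two PCR products through a shared restriction site.
# Assumptions (not from the specification): BamHI is the enzyme, the inserts are
# hypothetical, and only the top strand is modelled.

SITE = "GGATCC"  # BamHI recognition sequence
CUT = 1          # BamHI cuts the top strand after the first G (G^GATCC)


def digest(seq, site=SITE, cut=CUT):
    """Split seq at every occurrence of the site, cutting `cut` bases into it."""
    pieces, start = [], 0
    idx = seq.find(site)
    while idx != -1:
        pieces.append(seq[start:idx + cut])
        start = idx + cut
        idx = seq.find(site, idx + 1)
    pieces.append(seq[start:])
    return pieces


def ligate(left, right):
    """Join two fragments whose ends are compatible BamHI ends."""
    if not (left.endswith("G") and right.startswith("GATCC")):
        raise ValueError("fragment ends are not compatible BamHI ends")
    return left + right


# Hypothetical epitope-coding inserts; the site is carried on the primers.
insert_a = "ATGGCTAAA"
insert_b = "GATAAAACT"
amplicon_a = insert_a + SITE   # site added by the reverse primer of fragment A
amplicon_b = SITE + insert_b   # site added by the forward primer of fragment B

upstream = digest(amplicon_a)[0]     # fragment A with its cut end
downstream = digest(amplicon_b)[1]   # fragment B with its cut end
chimera = ligate(upstream, downstream)
```

In this sketch the junction regenerates a single six-nucleotide site, so the reading frame is preserved when each insert is a whole number of codons.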



The dominant epitopes may be identified by cleavage of the slpA products into
fragments by agents which cleave at known sites, and by immunoblotting with
homologous patient serum. Immunodominant peptides may be tested for their
capacity to stimulate T-cell proliferative responses in vitro, using mouse splenic
T-cells.

DNA vaccination involves immunisation with recombinant DNA encoding the
antigen or epitope of interest, cloned in a vector which promotes high level
expression in mammalian cells. Typically, the vector is a plasmid vector which
also replicates in a prokaryotic host such as Escherichia coli, so that the
DNA can be produced in quantity. Following immunisation, the plasmid enters a
host cell, where it remains in the nucleus, and directs synthesis of the recombinant
polypeptide. The polypeptide stimulates the production of neutralising antibodies, as
well as activating cytotoxic T-cells.

Using a DNA vaccine, it may be necessary to modify the DNA sequence to take
account of codon usage in humans. The G+C content of mammalian DNA is much
higher than that of C. difficile. The generation of such synthetic DNA molecules,
essentially containing numerous silent mutations, is within the scope of the
invention.
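
The idea of recoding a gene with silent mutations can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a design method taken from this description: each codon is swapped for the synonymous codon richest in G+C, which is only a crude stand-in for a real codon-usage table of the target host, and the input fragment is hypothetical.

```python
# Sketch: raise G+C content using silent (synonymous) codon changes only.
# Assumption: "pick the most G+C-rich synonym" is a crude proxy; a real design
# would use a codon-usage table for the intended host.

# Standard genetic code, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def gc_content(dna):
    """Fraction of G and C bases in the sequence."""
    return (dna.count("G") + dna.count("C")) / len(dna)


def translate(dna):
    """Translate complete codons only; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def recode_high_gc(dna):
    """Replace each codon with the synonymous codon richest in G+C."""
    out = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        synonyms = [c for c, a in CODON_TABLE.items() if a == amino_acid]
        # Ties are broken alphabetically so the result is deterministic.
        out.append(max(synonyms, key=lambda c: (c.count("G") + c.count("C"), c)))
    return "".join(out)


original = "ATGAAAAAAATTGCAATAGCT"  # hypothetical AT-rich coding fragment
recoded = recode_high_gc(original)
```

The recoded fragment encodes the same peptide as the original, which is the defining property of a set of silent mutations.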

A peptide vaccine will ideally be made using recombinant peptides. Similar
considerations apply as in the generation of a DNA vaccine with regard to expression
in a different host, such as Escherichia coli, which has a different codon usage
pattern to C. difficile. Problems of expression may be overcome by the use of a
special host strain which carries additional copies of rare tRNAs (e.g. E. coli
BL21-CodonPlus™-RIL from Stratagene), or by using de novo synthesis of a DNA
segment carrying silent mutations which will enable normal expression in E. coli.
There are many expression systems which are likely to allow high-level expression
of slpA genes in E. coli. An example is the pBAD/Thio TOPO vector of Invitrogen,
in which expressed genes are under control of the arabinose promoter, which is
subject to positive and negative control, enabling very tight control of expression. In
this vector, the recombinant protein is typically fused to a modified thioredoxin
carrying several histidine residues which enable purification by nickel
chromatography. The recombinant protein can be cleaved from the thioredoxin
moiety by enterokinase enzyme.

Affinity chromatography may also be used with fixed antibodies or some other agent 
which strongly binds the peptide of interest to purify the protein from the native 
organism.

Purified immunogenic peptides may be used in combination with other C. difficile
sub-units as a combined vaccine against C. difficile. Potential candidates are the
products of the other slp genes, which share limited homology with the slpA gene
product and with the N-acetylmuramoyl L-alanine amidase (CwlB) from Bacillus
subtilis, and which may be involved in remodelling of the peptidoglycan.

Other purified proteins of C. difficile to which constitutive antibodies are detected
in individuals recovering from C. difficile infection are also within the scope of the
present invention.

A deposit of Clostridium difficile strain 171500, PCR type 1, was made at the
NCIMB on January 29, 2001, and accorded the accession number NCIMB 41081.

A deposit of Clostridium difficile strain 170324, PCR type 12, was made at the 
NCIMB on January 29, 2001, and accorded the accession number NCIMB 41080.

Two peptides of the invention were found to contain the following sequences: 



33 kDa peptide

SEQ ID No. 1: DKTKVETADQGYTWQSKYK

31 kDa peptide

SEQ ID No. 2: ATTGTQGYTVVKNDGKKAVK

The invention will be more clearly understood from the following examples.

Example 1. Clinical Study

Examination of sequential antibody responses to C. difficile among elderly patients
who developed the disease was carried out. The study was based on the hypothesis
that the host immune response influenced the development of Clostridium difficile
disease. In particular we determined that a particular pattern of immune response to
C. difficile antigens correlated with the outcome of CDD.



Materials and Methods 



Patients 



Serum was collected from over 300 patients and of these 30 patients developed
CDD. The infecting strain (homologous strain) was grown from each patient.
Strains of C. difficile were typed at the Anaerobe Reference Laboratory, Wales
[O'Neill et al., 1996]. The most common strains isolated were PCR type 1 (n = 15)
which is the most common type causing epidemics and PCR type 12 (n = 5) which is
also a common hospital strain. Pre-infection serum samples were obtained from
patients. Acute phase sera were then collected from patients who developed C.
difficile disease. Convalescent sera were collected from patients who recovered.
Protein extracts of patients' infecting C. difficile strain were probed with the patients'
sera using Western blotting. IgG responses to the antigens were examined.

Western blotting

Proteins from SDS-PAGE gels were electroblotted (0.8 mA/cm² for 1 h) to PVDF
membrane using a semi-dry blotting apparatus (Atto). Primary antibodies (human
serum: 1/50 - 1/10,000 dilution) were detected using a 1/5000 dilution of anti-human
IgG (horseradish peroxidase-conjugated) in combination with enhanced
chemiluminescence (ECL). Blots were washed in phosphate buffered saline (pH 7.5)
containing Tween 20 (0.1% v/v), and incubated in the same solution comprising
dried skim milk (5% w/v) and antibodies at the appropriate concentration. Blots
were exposed to Kodak X-OMAT film for various periods of time and developed.



Results

Overall 5 patients made a full recovery and new antibody responses to previously
unrecognised antigens were evident in 4 of these patients. Three of these patients
had C. difficile belonging to PCR type 1 and one patient had C. difficile PCR type 12.
These patients developed an acute phase antibody response to previously
unrecognised C. difficile antigens which persisted during convalescence (Figs. 1A
and 1B). These antigens were recognised by antibodies from patients who recovered
and represent potential candidate vaccine antigens. Fig. 1A shows a strong reaction
of convalescent antibodies with the 33 kDa antigen (Lane 4, arrow).
Fig. 1B shows a strong reaction of convalescent antibodies with the 31
kDa antigen (Lanes 6 and 7, arrow).

These antibody responses have also been found in some controls in the same ward
who were also on antibiotics but who did not develop CDD.

Example 2. Further characterisation of protective antigens

Materials and Methods

Partial purification and N-terminal sequencing of the 33 kDa and the 31 kDa proteins

The antigens were partially purified from C. difficile based on their molecular weight
using preparative continuous-elution SDS-PAGE on a model 491 Prep-Cell (Bio-Rad).
The appropriate antigens were subsequently identified on Western blots
probed with serum obtained from individuals who recovered from C. difficile
infection.

Preparation of surface layer proteins (SLPs) 

SLPs were purified from C. difficile by extracting washed cells with 8 M urea, in 50
mM Tris HCl, pH 8.3 in the presence of a cocktail of protease inhibitors
(Complete®, Boehringer Mannheim), for 1 h at 37°C, followed by centrifugation at
19 000 x g for 30 min. The SLPs were recovered in the supernatant and dialysed to
remove the urea [Cerquetti et al., 2000].

Results 

The immunodominant protein which was associated with a positive outcome from C.
difficile strain 171500 (PCR type 1) was identified and purified using preparative
SDS-PAGE. The N-terminal region of the protein was sequenced using an Applied
Biosystems Procise Sequencer, viz DKTKVETADQGYTWQSKYK (SEQ ID
No. 1).

The antigen which was associated with a protective antibody response from the C.
difficile strain 170324 (PCR type 12) was identified and the N-terminal sequence
obtained, viz ATTGTQGYTVVKNDGKKAVK (SEQ ID No. 2).

These sequences were used to interrogate the C. difficile genome sequence using the
TBLASTN programme, which compared our query sequences with those of the
genome project (available at web address
http://www.sanger.ac.uk/Projects/C_difficile/), translated in all 6 possible reading
frames. A nearly identical stretch of sequence was identified when the sequence
from strain 170324 (type 12) was used for interrogation. The same stretch of
sequence was picked up when the sequence from strain 171500 (type 1) was used,
although the identity was much less strong. Since the homologous sequence
belonged to an open reading frame encoding a 719-residue peptide, this result was
somewhat surprising. However, when the N-terminal sequences from the higher
molecular weight SLP component were later published by Cerquetti et al. [2000], it
became apparent that they were encoded downstream along the same gene,
subsequently identified as slpA, and the reason for the discrepancy in size between
the gene and its products became readily apparent.
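
The principle of searching a peptide against DNA translated in all six reading frames can be sketched as follows. This is a toy Python illustration, not the TBLASTN algorithm itself: it scores only ungapped identities, and the "genome" is a short synthetic sequence built here by back-translating SEQ ID No. 2 with an arbitrary codon choice and placing it on the reverse strand. The flanking bases and the back-translation are assumptions made purely for the demonstration.

```python
# Toy six-frame search: ungapped identity scoring only (TBLASTN itself uses
# substitution matrices, gaps and statistics that are not modelled here).

# Standard genetic code, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Yield (strand, offset, protein) for all six reading frames."""
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            yield strand, offset, translate(seq[offset:])


def best_hit(query, dna):
    """Return (identities, strand, offset, protein_index) of the best ungapped match."""
    best = (0, None, None, None)
    for strand, offset, protein in six_frames(dna):
        for i in range(len(protein) - len(query) + 1):
            window = protein[i:i + len(query)]
            identities = sum(q == p for q, p in zip(query, window))
            if identities > best[0]:
                best = (identities, strand, offset, i)
    return best


QUERY = "ATTGTQGYTVVKNDGKKAVK"  # SEQ ID No. 2

# Synthetic demonstration DNA (hypothetical, not genome data): back-translate
# the query with the first codon listed for each residue, add arbitrary flanks,
# and place the coding region on the reverse strand.
BACK = {}
for codon, amino_acid in CODON_TABLE.items():
    BACK.setdefault(amino_acid, codon)
coding = "".join(BACK[aa] for aa in QUERY)
genome = reverse_complement("AT" + coding + "GCA")

hit = best_hit(QUERY, genome)
```

A weaker match, such as the one reported for the type 1 query, would appear in this sketch simply as a lower identity count at the same location.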

The purified SLPs from strains 171500 (PCR type 1) and 170324 (PCR type 12)
showed strong reactivity with homologous convalescent serum, and co-migrated
with the dominant antigens detected in crude cell extracts as shown in Fig. 2. Lanes
1 and 3 contain crude antigen preparations from PCR types 12 and 1 respectively,
and Lanes 2 and 4 contain SLP preparations from PCR types 12 and 1, respectively.
Panel A was probed with serum from a patient recovering from infection with PCR
type 1, and Panel B was probed with serum from a patient recovering from infection
with PCR type 12. Each serum detected 2 major antigens in the infecting strain
(Panel A, Lane 3; Panel B, Lane 1), which co-migrated with the 2 SLPs (Panel A,
Lane 4; Panel B, Lane 2), with which the sera also reacted strongly. Note that serum
from the patient infected with the PCR type 1 strain recognised the higher molecular
weight SLP from the PCR type 12 strain (Panel A, Lanes 1 and 2), whereas the
converse did not occur (Panel B, Lanes 3 and 4). There is no apparent antigenic
cross-reactivity with regard to the lower molecular weight SLPs.

SLPs were prepared from selected strains by urea extraction, and subjected to
SDS-PAGE and staining with Coomassie Blue (Fig. 3). Most strains showed a
characteristic profile, with two major bands located in the 29 000 to 36 000 and
45 000 to 50 000 molecular weight range. An exception was strain 172450 (Fig. 3,
Lane 2), which showed a single, high molecular weight band, approximately 43 000
in size.

Cloning, sequencing and analysis of slpA genes

The nucleotide sequences of the slpA genes from the two sample strains of C.
difficile (PCR types 1 and 12, deposited at the NCIMB) and of several others (PCR
types 5, 12, 17, 31, 46 and 92, available from the Anaerobe Reference Unit at the
Department of Medical Microbiology and Public Health Laboratory, Cardiff, Wales)
were obtained. The slpA gene and flanking sequence were amplified by polymerase
chain reaction from genomic DNA prepared from C. difficile using a commercial kit
(Puregene® DNA isolation kit for yeast and Gram positive bacteria, Gentra Systems,
Minneapolis, MN). The forward primer (5' ATGGATTATTATAGAGATGTGAG
3') was based on sequence from the genome sequencing project, starting 112
nucleotides upstream from the start of the slpA open reading frame. Two reverse
primers were used, depending on the PCR type. A downstream primer (5'
CTATTTAAAGTTTTATTAAAACTTATATTAC 3') was used to amplify slpA
from PCR types 12, 17, 31, 46 and 92. A reverse primer based on the 3' end of the
slpA open reading frame from strain 630 and the subsequent nonsense codon (5'
TTACATATCTAATAAATCTTTCATTTTGTTTATAACTG 3') was used to
amplify slpA from PCR types 1 and 5. The choice of primer for the latter two PCR
types may have resulted in a small number of systematic errors in the nucleotide
sequence obtained. PCR was carried out using HotStar™ Taq polymerase (Qiagen
Ltd., Crawley, West Sussex, UK) according to the manufacturer's instructions. A
single fragment of approximately 2 kb was obtained for each strain, which was then
cloned into the pBAD/Thio TOPO vector (Invitrogen, Groningen, Netherlands).
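
The relationship between the second reverse primer and the 3' end of the slpA open
reading frame can be checked with a short script. The following is a minimal sketch
in Python, not part of the method described; the function and variable names are
illustrative only. The ORF tail is copied from the final line of Appendix 1
(nucleotides 2221 to 2268).

```python
# Illustrative sketch: all names below are hypothetical, not from the specification.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Reverse primer used for PCR types 1 and 5 (5' to 3'), as given in the text.
REVERSE_PRIMER = "TTACATATCTAATAAATCTTTCATTTTGTTTATAACTG"

# Last 48 nucleotides of the slpA open reading frame as listed in Appendix 1.
ORF_TAIL = "GGTATAGCTACTTCAGTTATAAACAAAATGAAAGATTTATTAGATATG"

target = reverse_complement(REVERSE_PRIMER)  # coding-strand sequence covered by the primer
stop_codon = target[-3:]                     # the nonsense codon following the ORF
orf_part = target[:-3]                       # primer-derived portion inside the ORF

print(target)
print(len(orf_part), stop_codon, ORF_TAIL.endswith(orf_part))
```

Under these assumptions, the nucleotides immediately preceding the stop codon are
fixed by the primer rather than read from the template, which is why this region is
treated as not fully reliable for PCR types 1 and 5.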

Inserts were sequenced from both ends by standard procedures in commercial
facilities at MWG (Wolverton Mill South, Milton Keynes, UK) and Cambridge
University. New primers were designed on the basis of initial sequencing results,
enabling sequencing of both strands to be completed (a process known as
chromosome walking).

The results are shown in Appendices 1-8. 

The nucleotide sequences were translated to enable prediction of the amino acid
sequence(s) of the product(s) (Appendices 1-8). The N-terminal sequences obtained
experimentally for the low molecular weight protective antigens from strains 171500
(PCR type 1) and 170324 (PCR type 12) were almost identical to those predicted
from the nucleotide sequences of their respective slpA genes (18/20 identical
residues for strain 171500, and 19/20 identical residues for strain 170324).
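
The translation step can be illustrated with a few lines of Python. This is a
minimal sketch using the standard genetic code, not the software actually used;
the function name `translate` and the constant names are hypothetical. The input
is the first 60 nucleotides of slpA from strain 171500 as listed in Appendix 1.

```python
# Illustrative sketch: standard genetic code, codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 1 to 60 of slpA from strain 171500 (Appendix 1).
FIRST_60_NT = "ATGAATAAGAAAAATATAGCAATAGCTATGTCAGGTTTAACAGTTTTAGCTTCGGCTGCA"

print(translate(FIRST_60_NT))
```

The output can be compared residue by residue with the first translated row of
Appendix 1.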


Appendix 1 shows the open reading frame with translation for slpA from strain
171500 (PCR type 1), SEQ ID No 3. Since the reverse primer was based on the 35
nucleotides from the 3' end of the slpA gene, the sequence is not necessarily 100%
accurate in this region. However, this part of the gene does not seem to vary greatly
from strain to strain.

Appendix 2 shows the open reading frame with translation for slpA from strain
172450 (PCR type 5), SEQ ID No 4. Again, the sequence obtained for the 3' 35
nucleotides is not fully reliable. This gene is considerably smaller than the other
slpA genes sequenced, and shows strong sequence divergence from the other PCR
types examined.

Appendix 3 shows the open reading frame with translation for slpA from strain
170324 (PCR type 12), SEQ ID No 5. This gene showed a single base difference
when compared with the strain used for the genome sequencing project, strain 630,
of the same PCR type. The deduced amino acid sequence is identical.

Appendix 4 shows the open reading frame with translation for slpA from strain
171448 (PCR type 12), SEQ ID No 6. This gene was almost identical in sequence to
that from strain 170324.

Appendix 5 shows the open reading frame with translation for slpA from strain
171862 (PCR type 17), SEQ ID No 7.

Appendix 6 shows the open reading frame with translation for slpA from strain
173644 (PCR type 31), SEQ ID No 8. Like the slpA from strain 172450, this
sequence is very dissimilar to those of slpA genes from other PCR types
encountered.

Appendix 7 shows the open reading frame with translation for slpA from strain
170444 (PCR type 46), SEQ ID No 9. This sequence is virtually identical to that
obtained for slpA from PCR type 12 and 92 strains.

Appendix 8 shows the open reading frame with translation for slpA from strain
170426 (PCR type 92), SEQ ID No 10. This sequence is virtually identical to that
obtained for slpA from PCR type 12 and 46.






The cleavage site of the putative signal sequences from both genes was determined
from experimental evidence (the N-terminal sequence of the mature proteins as
determined by Edman degradation), and by the prediction tool of the Centre for
Biological Sequence Analysis at the Technical University of Denmark [Nielsen et
al., 1997]. The site for cleavage of the slpA gene product to form the mature SLPs
was predicted from experimental data [Cerquetti et al., 2000; Karjalainen et al.,
2001; Calabi et al., 2001]. The cleavage site is typically preceded by the motif TKS.
However, the relevant motif is likely to be TKG in strain 173644 (PCR type 31). No
obvious motif appeared for strain 172450 (PCR type 5). However, the protein
produced by type 5 strains does appear to be cleaved; hence we predicted the site to
occur at a point where the SLP sequence aligns with the cleavage sites of other PCR
types.
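
Locating the TKS motif in a translated sequence can be sketched as follows. This is
an illustration only: it assumes that cleavage occurs immediately after the motif,
and the function name `find_motif_sites` and the fragment constants are
hypothetical. The fragment is residues 321 to 380 of the translation shown in
Appendix 3 (strain 170324, PCR type 12).

```python
# Illustrative sketch: assumes the cleavage point lies directly after the motif.
def find_motif_sites(protein: str, motif: str = "TKS") -> list[int]:
    """Return 1-based positions of the residue that follows each motif occurrence."""
    sites = []
    start = protein.find(motif)
    while start != -1:
        sites.append(start + len(motif) + 1)
        start = protein.find(motif, start + 1)
    return sites


# Residues 321-380 of the Appendix 3 translation, joined from three 20-residue rows.
FRAGMENT_START = 321
FRAGMENT = (
    "NLVQLVNGKYQVIFYPEGKR"
    "LETKSANDTIASQDTPAKVV"
    "IKANKLKDLKDYVDDLKTYN"
)

for position in find_motif_sites(FRAGMENT):
    # Convert the position within the fragment to full-length residue numbering.
    print(FRAGMENT_START + position - 1, FRAGMENT[position - 1])
```

For strains without a recognisable motif, such as strain 172450, a search of this
kind returns nothing, and the site has to be inferred from an alignment instead, as
described above.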

The molecular weight and isoelectric point were calculated for each of the predicted
mature proteins by the ExPASy server of the Swiss Institute for Bioinformatics
(Table 1). In general, the calculated molecular weights were in fair agreement with
apparent molecular masses determined from migration in gels (Fig. 3). No lower
molecular weight band was apparent for strain 172450 (PCR type 5; Lane 2).
However, a higher molecular weight band is present, which is similar in size to the
predicted weight for the C-terminal moiety. We observed a similar profile for
another type 5 strain. It is possible that the lower molecular weight species is subject
to degradation in this strain. Another possibility is that it is heavily glycosylated,
which can affect staining. All peptides had a predicted isoelectric point below 7,
typical of acidic proteins, and characteristic of SLPs in general [Sleyter et al., 1993].
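
The ExPASy calculation itself is not reproduced in this specification. As a rough
illustration of how an average molecular weight is derived from a sequence, the
sketch below sums commonly tabulated average residue masses and adds one water
molecule. The function name is hypothetical, and the result is not expected to
match Table 1 to the last decimal place. The example peptide is residues 41 to 60
of the Appendix 3 translation.

```python
# Illustrative sketch: average residue masses in daltons (commonly tabulated values).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01524  # one water per chain, accounting for the free N- and C-termini


def average_molecular_weight(peptide: str) -> float:
    """Sum the residue masses of an unmodified peptide and add one water."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in peptide.upper()) + WATER


# Residues 41-60 of the Appendix 3 translation (strain 170324).
EXAMPLE_PEPTIDE = "KAVKQLQDGLKDNSIGKITV"

print(round(average_molecular_weight(EXAMPLE_PEPTIDE), 2))
```

A calculation of this kind ignores post-translational modification, which is one
reason why a heavily glycosylated species could migrate or stain differently from
what the predicted weight suggests.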



Table 1

C. difficile strain    pI             pI             MW             MW
(PCR type)             (N-terminal)   (C-terminal)   (N-terminal)   (C-terminal)

171500 (Type 1)        4.83           4.66           33365.41       44220.37
172450 (Type 5)        4.86           4.65           19364.46       42757.63
170324 (Type 12)       4.92           4.58           34228.25       39522.24
171448 (Type 12)       4.98           4.58           34156.18       39492.21
171862 (Type 17)       5.09           4.53           33783.73       39407.11
173644 (Type 31)       5.05           4.56           33626.48       41821.69
170444 (Type 46)       5.06           4.58           34230.31       39522.24
170426 (Type 92)       4.99           4.58           34242.32       39522.24

The translated nucleotide sequences were compared with published SlpA sequences
(EMBL Accession numbers AJ300676 and AJ300677 for examples from PCR types
1 and 17 respectively; strain 630 available from the Sanger Institute for PCR type
12; EMBL Accession number AY004256 for a variant from an unnamed PCR type).
The Clustal W alignment programme, which is freely available, was used. Where
SlpA sequences from our isolates were compared with those of other strains of the
same PCR types, they were found to be nearly or completely identical. This
observation indicates, together with existing knowledge from serotyping, that the
number of variants of slpA is not infinite, and that natural evolution of the gene is
not rapid. Table 2 shows a compilation of homologies, based on amino acid residue
identity, for the different translated sequences measured against published sequences.
Homologies are compiled for the predicted mature peptides, either combined (Table
2A) or as N-terminal (low molecular weight, less conserved moiety) (Table 2B) and
C-terminal (high molecular weight, more conserved) (Table 2C) mature peptides
according to predicted cleavage sites. It is clear that the SlpA sequences from strains
172450 (PCR type 5) and 173644 (PCR type 31) are quite distinct, particularly with
respect to the N-terminal region.
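
The percentage figures in Table 2 are based on residue identity between aligned
sequences. The sketch below shows one way such a figure can be computed once an
alignment is available. It is an illustration only: the exact definition used by
the alignment programme (for example, how gapped columns are counted) is an
assumption here, and the function name is hypothetical. The two example rows are
residues 141 to 160 of the translations in Appendices 3 and 4.

```python
# Illustrative sketch: identity over aligned columns, skipping columns with a gap.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues between two equal-length aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Residues 141-160 from Appendix 3 (strain 170324) and Appendix 4 (strain 171448).
STRAIN_170324 = "DIATKDTFGMVSKTQDSEGK"
STRAIN_171448 = "DIATKDTFGMVSKTQDSGGK"

print(round(percent_identity(STRAIN_170324, STRAIN_171448), 1))
```

Applied over the full mature peptides rather than a 20-residue window, the same
calculation yields whole-sequence figures of the kind compiled in Tables 2A to 2C.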

Table 2A

Strain.type      630          AJ300676     AJ300677     AY004256
                 (type 12)    (type 1)     (type 17)    (type unknown)

171500.type1     55.2         99.7         55.4         56.42
172450.type5     49.8         54.0         49.9         47.77
170324.type12    100.0        57.8         81.7         59.77
171448.type12    99.7
171862.type17    82.3         58.7         100          57.54
173644.type31    57.9         59.2         60.1         56.88
170444.type46    99.6
170426.type92    99.9

Table 2B

Strain.type      630          AJ300676     AJ300677     AY004256
                 (type 12)    (type 1)     (type 17)    (type unknown)

171500.type1     35.4         100          34.5         33.54
172450.type5     31.6                                   24.58
170324.type12    100          34.9         64.6         36.14
171448.type12    99.7
171862.type17    64.3         34.4         100          31.55
173644.type31    37.5         34.1         41.3         31.86
170444.type46    99.1
170426.type92    99.7

Table 2C

Strain.type      630          AJ300676     AJ300677     AY004256
                 (type 12)    (type 1)     (type 17)    (type unknown)

171500.type1     70.2         99.5         71.2         73.80
172450.type5     58.4         60.4         63.0         57.60
170324.type12    100          77.3         97.1         80.00
171448.type12    99.7
171862.type17    97.3         78.8         100          79.62
173644.type31    74.1         78.9         75.1         75.38
170444.type46    100
170426.type92    100

The term antibody as used throughout the specification includes, but is not limited
to, polyclonal, monoclonal, chimeric, single chain, Fab fragments and fragments
produced by a Fab expression library.

The antibodies and fragments thereof may be humanised antibodies. Neutralising 
antibodies such as those which inhibit biological activity of the substance amino acid 
sequence are especially preferred for diagnostics and therapeutics. 



Antibodies, both polyclonal and monoclonal, which are directed against epitopes
obtainable from a polypeptide or peptide of the present invention are particularly
useful in diagnosis, and those which are neutralising are useful in passive
immunotherapy.

Antibodies may be produced by any of the standard techniques well known in the art.

A therapeutically effective amount of the polypeptide, polynucleotide, peptide or
antibody of the invention in the form of a pharmaceutical composition may be
administered. The composition may optionally comprise a pharmaceutically
acceptable carrier, diluent or excipient, including combinations thereof. The
pharmaceutical composition may be used in conjunction with one or more additional
pharmaceutically active compounds and/or adjuvants.

Different adjuvants, depending on the host, may be used to increase immunological
response. The adjuvant may be selected from the group comprising Freund's, mineral
gels such as aluminium hydroxide and surface active substances.

The vaccine of the invention may be in the form of an immune modulating
composition or pharmaceutical composition and may be administered by a number of
different routes, such as by injection (which includes parenteral, subcutaneous and
intramuscular injection), or by intranasal, mucosal, oral, intra-vaginal,
urethral or ocular administration. There may be different formulation/composition
requirements dependent on the different delivery systems.



The invention is not limited to the embodiments hereinbefore described which may 
be varied in detail. 

Appendix 1

SEQ ID No. 3. Nucleotide sequence of slpA from Clostridium
difficile strain 171500, PCR type 1, with translation. The putative
secretory signal cleavage site (□) and site of cleavage to form the
two mature SLPs (♦) are indicated.

   1 ATGAATAAGAAAAATATAGCAATAGCTATGTCAGGTTTAACAGTTTTAGCTTCGGCTGCA 60
   1 MNKKNIAIAMSGLTVLASAA 20

  61 CCTGTATTTGCAGATGATACAAAAGTTGAAACTGGTGATCAAGGATATACAGTGGTACAA 120
  21 PVFADDTKVETGDQGYTVVQ 40
     □

 121 AGCAAGTATAAGAAAGCTGTTGAACAATTACAAAAAGGAATATTAGATGGAAGTATAACA 180
  41 SKYKKAVEQLQKGILDGSIT 60

 181 GAAATTAAAGTTTTCTTTGAGGGAACTTTAGCATCTACTATAAAAGTAGGTTCTGAGCTT 240
  61 EIKVFFEGTLASTIKVGSEL 80

 241 AATGCAGCAGATGCAAGTAAATTATTGTTTACACAAGTAGATAATAAACTAGATAATTTA 300
  81 NAADASKLLFTQVDNKLDNL 100

 301 GGTGATGGAGATTATGTAGATTTCTTAATAACTTCTCCAGGTCAAGGGGATAAAATAACT 360
 101 GDGDYVDFLITSPGQGDKIT 120

 361 ACAAGTAAACTTGTTGCATTGAAAGATTTAACAGGTGCTTCAGCAGATGCTATAATTGCT 420
 121 TSKLVALKDLTGASADAIIA 140

 421 GGAACATCTTCAGCAGATGGTGTTGTTACAAATACTGGAGCTGCTAGTGGTTCTACTGAG 480
 141 GTSSADGVVTNTGAASGSTE 160

 481 ACAAATTCAGCAGGAACAAAACTTGCAATGTCAGCTATTTTTGACACAGCATATACAGAT 540
 161 TNSAGTKLAMSAIFDTAYTD 180

 541 TCATCTGAAACTGCGGTTAAGATTACTATAAAAGCAGATATGAATGATACTAA?TTTGGT 600
 181 SSETAVKITIKADMNDTKFG 200

 601 AAAGCAGGTGAGACAACTTATTCAACTGGGCTTACATTTGAAGATGGGTCTACAGAAAAA 660
 201 KAGETTYSTGLTFEDGSTEK 220

 661 ATTGTTAAATTAGGGGACAGTGATATTATAGATATAACTAAAGCTCTTAAACTTACTGTT 720
 221 IVKLGDSDIIDITKALKLTV 240

 721 GTTCCTGGAAGTAAAGCAACTGTTAAGTTTGCTGAAAAAACACCAAGTGCCAGTGTTCAA 780
 241 VPGSKATVKFAEKTPSASVQ 260

 781 CCAGTAATAACAAAGCTTAGAATAATAAATGCTAAAGAAGAAACAATAGATATTGACGCT 840
 261 PVITKLRIINAKEETIDIDA 280

 841 AGTTCTAGTAAAACAGCACAAGATTTAGCTAAAAAATATGTATTTAATAAAACTGATTTA 900
 281 SSSKTAQDLAKKYVFNKTDL 300

 901 AATACTCTTTATAAAGTATTAAATGGAGATGAAGCAGATACTAATGGATTAATAGAAGAA 960
 301 NTLYKVLNGDEADTNGLIEE 320

 961 GTTAGTGGAAAATATCAAGTAGTTCTTTATCCAGAAGGAAAAAGAGTTACAACTAAGAGT 1020
 321 VSGKYQVVLYPEGKRVTTKS 340

1021 GCTGCAAAGGCTTCAATTGCTGATGAAAATTCACCAGTTAAATTAACTCTTAAGTCAGAT 1080
 341 AAKASIADENSPVKLTLKSD 360
     ♦

1081 AAGAAGAAAGACTTAAAAGATTATGTGGATGATTTAAGAACATATAATAATGGATATTCA 1140
 361 KKKDLKDYVDDLRTYNNGYS 380

1141 AATGCTATAGAAGTAGCAGGAGAAGATAGAATAGAAACTGCAATAGCATTAAGTCAAAAA 1200
 381 NAIEVAGEDRIETAIALSQK 400

1201 TATTATAACTCTGATGATGAAAATGCTATATTTAGAGATTCAGTTGATAATGTAGTATTG 1260
 401 YYNSDDENAIFRDSVDNVVL 420

1261 GTTGGAGGAAATGCAATAGTTGATGGACTTGTAGCTTCTCCTTTAGCTTCTGAAAAGAAA 1320
 421 VGGNAIVDGLVASPLASEKK 440

1321 GCTCCTTTATTATTAACTTCAAAAGATAAATTAGATTCAAGCGTAAAAGCTGAAATAAAG 1380
 441 APLLLTSKDKLDSSVKAEIK 460

1381 AGAGTTATGAATATAAAGAGTACAACAGGTATAAATACTTCAAAGAAAGTTTATTTAGCT 1440
 461 RVMNIKSTTGINTSKKVYLA 480

1441 GGTGGAGTTAATTCTATATCTAAAGAAGTAGAAAATGAATTAAAAGATATGGGACTTAAA 1500
 481 GGVNSISKEVENELKDMGLK 500

1501 GTTACAAGATTAGCAGGAGATGATAGATATGAAACTTCTCTAAAAATAGCTGATGAAGTA 1560
 501 VTRLAGDDRYETSLKIADEV 520

1561 GGTCTTGATAATGATAAAGCATTTGTAGTTGGAGGAACAGGATTAGCAGATGCCATGAGT 1620
 521 GLDNDKAFVVGGTGLADAMS 540

1621 ATAGCTCCAGTTGCATCTCAATTAAGAAATGCTAATGGTAAAATGGATTTAGCTGATGGT 1680
 541 IAPVASQLRNANGKMDLADG 560

1681 GATGCTACACCAATAGTAGTTGTAGATGGAAAAGCTAAAACTATAAATGATGATGTAAAA 1740
 561 DATPIVVVDGKAKTINDDVK 580

1741 GATTTCTTAGATGATTCACAAGTTGATATAATAGGTGGAGAAAACAGTGTATCTAAAGAT 1800
 581 DFLDDSQVDIIGGENSVSKD 600

1801 GTTGAAAATGCAATAGATGATGCTACAGGTAAATCTCCAGATAGATATAGTGGAGATGAT 1860
 601 VENAIDDATGKSPDRYSGDD 620

1861 AGACAAGCAACTAATGCAAAAGTTATAAAAGAATCTTCTTATTATCAAGATAACTTAAAT 1920
 621 RQATNAKVIKESSYYQDNLN 640

1921 AATGATAAAAAAGTAGTTAATTTCTTTGTAGCTAAAGATGGTTCTACTAAAGAAGATCAA 1980
 641 NDKKVVNFFVAKDGSTKEDQ 660

1981 TTAGTTGATGCTTTAGCAGCAGCTCCAGTTGCAGCAAACTTTGGTGTAACTCTTAATTCT 2040
 661 LVDALAAAPVAANFGVTLNS 680

2041 GATGGTAAGCCAGTAGATAAAGATGGTAAAGtATTAACTGGTTCTGATAATGATAAAAAT 2100
 681 DGKPVDKDGKVLTGSDNDKN 700

2101 AAATTAGTATCTCCAGCACCTATAGTATTAGCTACTGATTCTTTATCTTCAGATCaAAGT 2160
 701 KLVSPAPIVLATDSLSSDQS 720

2161 GTATCTATAAGTAaAGTTCTTGATAAAGATAATGGAGAAAACTTAGTTCAAGTTGGTAAA 2220
 721 VSISKVLDKDNGENLVQVGK 740

2221 GGTATAGCTACTTCAGTTATAAACAAAATGAAAGATTTATTAGATATG 2268
 741 GIATSVINKMKDLLDM 756

Appendix 2

SEQ ID No. 4. Nucleotide sequence of slpA from Clostridium
difficile strain 172450, PCR type 5, with translation. The
putative secretory signal cleavage site (□) is indicated, and an
approximation of the site of cleavage to form the two mature
SLPs (♦) is also indicated.

   1 ATGAAAAAAAGAAATTTAGCAATGGCTATGGCAGCTGTTACTGTAGTAGGTTCTGCTGCT 60
   1 MKKRNLAMAMAAVTVVGSAA 20

  61 CCAGTTTTTGCAGCAGCTTCAGATGTAATATCACTACAAGATGGTACAAATGATAAGTAT 120
  21 PVFAAASDVISLQDGTNDKY 40
     □

 121 ACAGTATCAAATACTAAAGCTAGTGACTTAGTAAAGGATATTTTAGCAGCACAAAACTTA 180
  41 TVSNTKASDLVKDILAAQNL 60

 181 ACAACAGGTGCAGTTATTTTGAACAAAGATACAAAAGTTACTTTCTATGATGCAAATGAG 240
  61 TTGAVILNKDTKVTFYDANE 80

 241 AAAGATTCTTCAACTCCAACTGGAGATAAAAAAGTTTATTCAGAACAAACTTTAACTACA 300
  81 KDSSTPTGDKKVYSEQTLTT 100

 301 GCTAATGGAAATGAAGATTATGTAAAGACAACTTTAAAAAATTTAGATGCAGGAGAATAT 360
 101 ANGNEDYVKTTLKNLDAGEY 120

 361 GCTATTATAGATTTAACTTATAATAATGCTAAAACTGTTGAAATTAAAGTAGTAGCAGCT 420
 121 AIIDLTYNNAKTVEIKVVAA 140

 421 AGTGAAAAAACAGTAGTTGTATCTAGTGATGCGAAAAATAGTGCAAAAGATATAGCTGAA 480
 141 SEKTVVVSSDAKNSAKDIAE 160

 481 AAATATGTGTTTGAAGACAAAGACTTAGAAAATGCACTAAAAACTATAAATGCCTCAGAT 540
 161 KYVFEDKDLENALKTINASD 180

 541 TTCAGTAAAACTGATAGTTACTATCAAGTAGTTCTTTATCCAAAAGGAAAGAGATTACAA 600
 181 FSKTDSYYQVVLYPKGKRLQ 200

 601 GGTTTCTCAACTTATAGAGGTACAAATTATAATGAAGGAACTGCATATGGTAATACACCA 660
 201 GFSTYRATNYNEGTAYGNTP 220
     ♦

 661 GTAATATTAACTCTAAAATCTACTAGTAAGAGTAATTTAAAGACTGCAGTAGAAGAGTTA 720
 221 VILTLKSTSKSNLKTAVEEL 240

 721 CAAAAATTGAATGCTAGTTATTCTAATACTACAACTTTAGCTGGTGATGACAGAATACAA 780
 241 QKLNASYSNTTTLAGDDRIQ 260

 781 ACAGCTATAGAGATAAGTAAAGAATATTACAATAATGATGGCGAGAAATCAGATCATTCA 840
 261 TAIEISKEYYNNDGEKSDHS 280

 841 GCTGATGTTAAAGAGAATGTTAAAAATGTTGTATTAGTAGGTGCAAATGCACTAGTAGAT 900
 281 ADVKENVKNVVLVGANALVD 300

 901 GGATTAGTTGCGGCTCCTTTAGCAGCAGAAAAAGATGCTCCACTATTATTAACTTCAAAA 960
 301 GLVAAPLAAEKDAPLLLTSK 320

 961 GATAAATTAGATTCGTCAGTAAAATCTGAAATAAAGAGAGTTTTAGACTTAAAAACTTCA 1020
 321 DKLDSSVKSEIKRVLDLKTS 340

1021 ACAGAAGTAACAGGAAAAACAGTTTATATAGCTGGTGGAGTTAATAGTGTATCTAAAGAA 1080
 341 TEVTGKTVYIAGGVNSVSKE 360

1081 GTTGTAACAGAATTAGAATCAATGGGATTAAAAGTTGAAAGATTCTCAGGTGATGATAGA 1140
 361 VVTELESMGLKVERFSGDDR 380

1141 TATGAAACTTCTTTAAAAATAGCAGGTGAAATAGGCTTAGATAATGATAAGGCTTATGTA 1200
 381 YETSLKIAGEIGLDNDKAYV 400

1201 GTTGGTGGAACAGGATTAGCAGATGCCATGAGTATAGCTTCAGTTGCTTCTACTAAATTA 1260
 401 VGGTGLADAMSIASVASTKL 420

1261 GATGGTAATGGTGTTGTAGATAGAACAAATGGACATGCTACTCCAATAGTTGTTGTAGAT 1320
 421 DGNGVVDRTNGHATPIVVVD 440

1321 GGAAAAGCTGATAAAATATCTGATGACTTAGATAGTTTCTTAGGAAGCGCTGATGTAGAT 1380
 441 GKADKISDDLDSFLGSADVD 460

1381 ATAATAGGTGGATTTGCAAGTGTATCTGAAAAGATGGAAGAAGCTATATCAGATGCTACT 1440
 461 IIGGFASVSEKMEEAISDAT 480

1441 GGTAAAGGCGTTACAAGAGTTAAAGGCGACGATAGACAAGACACTAACTCTGAAGTTATA 1500
 481 GKGVTRVKGDDRQDTNSEVI 500

1501 AAAACATATTATGCTAATGATACTGAAATAGCTAAAGCTGCAGTTTTAGATAAAGATTCA 1560
 501 KTYYANDTEIAKAAVLDKDS 520

1561 GGTGCTTCAAGTAGTGATGCAGGAGTATTTAATTTCTATGTAGCTAAAGATGGATCTACA 1620
 521 GASSSDAGVFNFYVAKDGST 540

1621 AAAGAAGATCAATTAGTTGATGCATTAGCAGTAGGAGCTGTTGCTGGATATAAACTTGCT 1680
 541 KEDQLVDALAVGAVAGYKLA 560

1681 CCAGTTGTATTAGCTACTGATTCTTTATCTTCTGATCAATCGGTTGCTATAAGCAAAGTT 1740
 561 PVVLATDSLSSDQSVAISKV 580

1741 GTAGGAGAAAAATATTCTAAAGATTTAACACAAGTTGGTCAAGGAATAGCTAATTCAGTT 1800
 581 VGEKYSKDLTQVGQGIANSV 600

1801 ATAAACAAAATGAAAGATTTATTAGATATG 1830
 601 INKMKDLLDM 610

Appendix 3

SEQ ID No. 5. Nucleotide sequence of slpA from Clostridium
difficile strain 170324, PCR type 12, with translation. The
putative secretory signal cleavage site (□) and site of cleavage to
form the two mature SLPs (♦) are indicated.

   1 ATGAATAAGAAAAATATAGCAATAGCTATGTCAGGTTTAACAGTTTTAGCTTCGGCTGCT 60
   1 MNKKNIAIAMSGLTVLASAA 20

  61 CCTGTTTTTGCTGCAACTACTGGAACACAAGGTTATACTGTAGTTAAAAACGACTGGAAA 120
  21 PVFAATTGTQGYTVVKNDWK 40
     □

 121 AAAGCAGTAAAACAATTACAAGATGGACTAAAAGATAATAGTATAGGAAAGATAACTGTA 180
  41 KAVKQLQDGLKDNSIGKITV 60

 181 TCTTTTAATGATGGGGTTGTGGGTGAAGTAGCTCCTAAAAGTGCTAATAAGAAAGCGGAC 240
  61 SFNDGVVGEVAPKSANKKAD 80

 241 AGAGATGCTGCAGCTGAGAAGTTATATAATCTTGTTAACACTCAATTAGATAAATTAGGT 300
  81 RDAAAEKLYNLVNTQLDKLG 100

 301 GATGGAGATTATGTTGATTTTTCTGTAGATTATAATTTAGAAAACAAAATAATAACTAAT 360
 101 DGDYVDFSVDYNLENKIITN 120

 361 CAAGCAGATGCAGAAGCAATTGTTACAAAGTTAAATTCACTTAATGAGAAAACTCTTATT 420
 121 QADAEAIVTKLNSLNEKTLI 140

 421 GATATAGCAACTAAAGATACTTTTGGAATGGTTAGTAAAACACAAGATAGTGAAGGTAAA 480
 141 DIATKDTFGMVSKTQDSEGK 160

 481 AATGTTGCTGCAACAAAGGCACTTAAAGTTAAAGATGTTGCTACATTTGGTTTGAAGTCT 540
 161 NVAATKALKVKDVATFGLKS 180

 541 GGTGGAAGCGAAGATACTGGATATGTTGTTGAAATGAAAGCAGGAGCTGTAGAGGATAAG 600
 181 GGSEDTGYVVEMKAGAVEDK 200

 601 TATGGTAAAGTTGGAGATAGTACGGCAGGTATTGCAATAAATCTTCCTAGTACTGGACTT 660
 201 YGKVGDSTAGIAINLPSTGL 220

 661 GAATATGCAGGTAAAGGAACAACAATTGATTTTAATAAAACTTTAAAAGTTGATGTAACA 720
 221 EYAGKGTTIDFNKTLKVDVT 240

 721 GGTGGTTCAACACCTAGTGCTGTAGCTGTAAGTGGTTTTGTAACTAAAGATGATACTGAT 780
 241 GGSTPSAVAVSGFVTKDDTD 260

 781 TTAGCAAAATCAGGTACTATAAATGTAAGAGTTATAAATGCAAAAGAAGAATCAATTGAT 840
 261 LAKSGTINVRVINAKEESID 280

 841 ATAGATGCAAGCTCATATACATCAGCTGAAAATTTAGCTAAAAGATATGTATTTGATCCA 900
 281 IDASSYTSAENLAKRYVFDP 300

 901 GATGAAATTTCTGAAGCATATAAGGCAATAGTAGCATTACAAAATGATGGTATAGAGTCT 960
 301 DEISEAYKAIVALQNDGIES 320

 961 AACTTAGTTCAGTTAGTTAATGGAAAATATCAAGTGATTTTTTATCCAGAAGGTAAAAGA 1020
 321 NLVQLVNGKYQVIFYPEGKR 340

1021 TTAGAAACTAAATCAGCAAATGATACAATAGCTAGTCAAGATACACCAGCTAAAGTAGTT 1080
 341 LETKSANDTIASQDTPAKVV 360
     ♦

1081 ATAAAAGCTAATAAATTAAAAGATTTAAAAGATTATGTAGATGATTTAAAAACATATAAT 1140
 361 IKANKLKDLKDYVDDLKTYN 380

1141 AATACTTATTCAAATGTTGTAACAGTAGCAGGAGAAGATAGAATAGAAACTGCTATAGAA 1200
 381 NTYSNVVTVAGEDRIETAIE 400

1201 TTAAGTAGTAAATATTATAATTCTGATGATAAAAATGCAATAACTGATAAAGCAGTTAAT 1260
 401 LSSKYYNSDDKNAITDKAVN 420

1261 GATATAGTATTAGTTGGATCTACATCTATAGTTGATGGTCTTGTTGCATCACCATTAGCT 1320
 421 DIVLVGSTSIVDGLVASPLA 440

1321 TCAGAAAAAACAGCTCCATTATTATTAACTTCAAAAGATAAATTAGATTCATCAGTAAAA 1380
 441 SEKTAPLLLTSKDKLDSSVK 460

1381 TCTGAAATAAAGAGAGTTATGAACTTAAAGAGTGACACTGGTATAAATACTTCTAAAAAA 1440
 461 SEIKRVMNLKSDTGINTSKK 480

1441 GTTTATTTAGCTGGTGGAGTTAATTCTATATCTAAAGATGTAGAAAATGAATTGAAAAAC 1500
 481 VYLAGGVNSISKDVENELKN 500

1501 ATGGGTCTTAAAGTTACTAGATTATCAGGAGAAGACAGATACGAAACTTCTTTAGCAATA 1560
 501 MGLKVTRLSGEDRYETSLAI 520

1561 GCTGATGAAATAGGTCTTGATAATGATAAAGCATTTGTAGTTGGTGGTACTGGATTAGCA 1620
 521 ADEIGLDNDKAFVVGGTGLA 540

1621 GATGCTATGAGTATAGCTCCAGTTGCTTCTCAACTTAAAGATGGAGATGCTACTCCAATA 1680
 541 DAMSIAPVASQLKDGDATPI 560

1681 GTAGTTGTAGATGGAAAAGCAAAAGAAATAAGTGATGATGCTAAGAGTTTCTTAGGAACT 1740
 561 VVVDGKAKEISDDAKSFLGT 580

1741 TCTGATGTTGATATAATAGGTGGAAAAAATAGCGTATCTAAAGAGATTGAAGAGTCAATA 1800
 581 SDVDIIGGKNSVSKEIEESI 600

1801 GATAGTGCAACTGGAAAAACTCCAGATAGAATAAGTGGAGATGATAGACAAGCAACTAAT 1860
 601 DSATGKTPDRISGDDRQATN 620

1861 GCTGAAGTTTTAAAAGAAGATGATTATTTCACAGATGGTGAAGTTGTGAATTACTTTGTT 1920
 621 AEVLKEDDYFTDGEVVNYFV 640

1921 GCAAAAGATGGTTCTACTAAAGAAGATCAATTAGTAGATGCCTTAGCAGCAGCACCAATA 1980
 641 AKDGSTKEDQLVDALAAAPI 660

1981 GCAGGTAGATTTAAGGAGTCTCCAGCTCCAATCATACTAGCTACTGATACTTTATCTTCT 2040
 661 AGRFKESPAPIILATDTLSS 680

2041 GACCAAAATGTAGCTGTAAGTAAAGCAGTTCCTAAAGATGGTGGAACTAACTTAGTTCAA 2100
 681 DQNVAVSKAVPKDGGTNLVQ 700

2101 GTAGGTAAAGGTATAGCTTCTTCAGTTATAAACAAAATGAAAGATTTATTAGATATG 2157
 701 VGKGIASSVINKMKDLLDM 719

Appendix 4

SEQ ID No. 6. Nucleotide sequence of slpA from Clostridium
difficile strain 171448, PCR type 12, with translation. The
putative secretory signal cleavage site (□) and site of cleavage to
form the two mature SLPs (♦) are indicated.

   1 ATGAATAAGAAAAATATAGCAATAGCTATGTCAGGTTTAACAGTTTTAGCTTCGGCTGCT 60
   1 MNKKNIAIAMSGLTVLASAA 20

  61 CCTGTTTTTGCTGCAACTACTGGAACACAAGGTTATACTGTAGTTAAAAACGACTGGAAA 120
  21 PVFAATTGTQGYTVVKNDWK 40
     □

 121 AAAGCAGTAAAACAATTACAAGATGGACTAAAAGATAATAGTATAGGAAAGATAACTGTA 180
  41 KAVKQLQDGLKDNSIGKITV 60

 181 TCTTTTAATGATGGGGTTGTGGGTGAAGTAGCTCCTAAAAGTGCTAATAAGAAAGCGGAC 240
  61 SFNDGVVGEVAPKSANKKAD 80

 241 AGAGATGCTGCAGCTGAGAAGTTATATAATCTTGTTAACACTCAATTAGATAAATTAGGT 300
  81 RDAAAEKLYNLVNTQLDKLG 100

 301 GATGGAGATTATGTTGATTTTTCTGTAGATTATAATTTAGAAAACAAAATAATAACTAAT 360
 101 DGDYVDFSVDYNLENKIITN 120

 361 CAAGCAGATGCAGAAGCAATTGTTACAAAGTTAAATTCACTTAATGAGAAAACTCTTATT 420
 121 QADAEAIVTKLNSLNEKTLI 140

 421 GATATAGCAACTAAAGATACTTTTGGAATGGTTAGTAAAACACAAGATAGTGGAGGTAAA 480
 141 DIATKDTFGMVSKTQDSGGK 160

 481 AATGTTGCTGCAACAAAGGCACTTAAAGTTAAAGATGTTGCTACATTTGGTTTGAAGTCT 540
 161 NVAATKALKVKDVATFGLKS 180

 541 GGTGGAAGCGAAGATACTGGATATGTTGTTGAAATGAAAGCAGGAGCTGTAGAGGATAAG 600
 181 GGSEDTGYVVEMKAGAVEDK 200

 601 TATGGTAAAGTTGGAGATAGTACGGCAGGTATTGCAATAAATCTTCCTAGTACTGGACTT 660
 201 YGKVGDSTAGIAINLPSTGL 220

 661 GAATATGCAGGTAAAGGAACAACAATTGATTTTAATAAAACTTTAAAAGTTGATGTAACA 720
 221 EYAGKGTTIDFNKTLKVDVT 240

 721 GGTGGTTCAACACCTAGTGCTGTAGCTGTAAGTGGTTTTGTAACTAAAGATGATACTGAT 780
 241 GGSTPSAVAVSGFVTKDDTD 260

 781 TTAGCAAAATCAGGTACTATAAATGTAAGAGTTATAAATGCAAAAGAAGAATCAATTGAT 840
 261 LAKSGTINVRVINAKEESID 280

 841 ATAGATGCAAGCTCATATACATCAGCTGAAAATTTAGCTAAAAGATATGTATTTGATCCA 900
 281 IDASSYTSAENLAKRYVFDP 300

 901 GATGAAATTTCTGAAGCATATAAGGCAATAGTAGCATTACAAAATGATGGTATAGAGTCT 960
 301 DEISEAYKAIVALQNDGIES 320

 961 AATTTAGTTCAGTTAGTTAATGGAAAATATCAAGTGATTTTTTATCCAGAAGGTAAAAGA 1020
 321 NLVQLVNGKYQVIFYPEGKR 340

1021 TTAGAAACTAAATCAGCAAATGATACAATAGCTAGTCAAGATACACCAGCTAAAGTAGTT 1080
 341 LETKSANDTIASQDTPAKVV 360
     ♦

1081 ATAAAAGCTAATAAATTAAAAGATTTAAAAGATTATGTAGATGATTTAAAAACATATAAT 1140
 361 IKANKLKDLKDYVDDLKTYN 380

1141 AATACTTATTCAAATGTTGTAACAGTAGCAGGAGAAGATAGAATAGAAACTGCTATAGAA 1200
 381 NTYSNVVTVAGEDRIETAIE 400

1201 TTAAGTAGTAAATATTATAATTCTGATGATAAAAATGCAATAACTGATAAAGCAGTTAAT 1260
 401 LSSKYYNSDDKNAITDKAVN 420

1261 GATATAGTATTAGTTGGATCTACATCTATAGTTGATGGTCTTGTTGCATCACCATTAGCT 1320
 421 DIVLVGSTSIVDGLVASPLA 440

1321 TCAGAAAAAACAGCTCCATTATTATTAGCTTCAAAAGATAAATTAGATTCATCAGTAAAA 1380
 441 SEKTAPLLLASKDKLDSSVK 460

1381 TCTGAAATAAAGAGAGTTATGAACTTAAAGAGTGACACTGGTATAAATACTTCTAAAAAA 1440
 461 SEIKRVMNLKSDTGINTSKK 480

1441 GTTTATTTAGCTGGTGGAGTTAATTCTATATCTAAAGATGTAGAAAATGAATTGAAAAAC 1500
 481 VYLAGGVNSISKDVENELKN 500

1501 ATGGGTCTTAAAGTTACTAGATTATCAGGAGAAGACAGATACGAAACTTCTTTAGCAATA 1560
 501 MGLKVTRLSGEDRYETSLAI 520

1561 GCTGATGAAATAGGTCTTGATAATGATAAAGCATTTGTAGTTGGTGGTACTGGATTAGCA 1620
 521 ADEIGLDNDKAFVVGGTGLA 540

1621 GATGCTATGAGTATAGCTCCAGTTGCTTCTCAACTTAAAGATGGAGATGCTACTCCAATA 1680
 541 DAMSIAPVASQLKDGDATPI 560

1681 GTAGTTGTAGATGGAAAAGCAAAAGAAATAAGTGATGATGCTAAGAGTTTCTTAGGAACT 1740
 561 VVVDGKAKEISDDAKSFLGT 580

1741 TCTGATGTTGATATAATAGGTGGAAAAAATAGCGTATCTAAAGAGATTGAAGAGTCAATA 1800
 581 SDVDIIGGKNSVSKEIEESI 600

1801 GATAGTGCAACTGGAAAAACTCCAGATAGAATAAGTGGAGATGATAGACAAGCAACTAAT 1860
 601 DSATGKTPDRISGDDRQATN 620

1861 GCTGAAGTTTTAAAAGAAGATGATTATTTCACAGATGGTGAAGTTGTGAATTACTTTGTT 1920
 621 AEVLKEDDYFTDGEVVNYFV 640

1921 GCAAAAGATGGTTCTACTAAAGAAGATCAATTAGTAGATGCCTTAGCAGCAGCACCAATA 1980
 641 AKDGSTKEDQLVDALAAAPI 660

1981 GCAGGTAGATTTAAGGAGTCTCCAGCTCCAATCATACTAGCTACTGATACTTTATCTTCT 2040
 661 AGRFKESPAPIILATDTLSS 680

2041 GACCAAAATGTAGCTGTAAGTAAAGCAGTTCCTAAAGATGGTGGAACTAACTTAGTTCAA 2100
 681 DQNVAVSKAVPKDGGTNLVQ 700

2101 GTAGGTAAAGGTATAGCTTCTTCAGTTATAAACAAAATGAAAGATTTATTAGATATG 2157
 701 VGKGIASSVINKMKDLLDM 719

Appendix 5

SEQ ID No. 7. Nucleotide sequence of slpA from Clostridium
difficile strain 171862, PCR type 17, with translation. The
putative secretory signal cleavage site (□) and site of cleavage to
form the two mature SLPs (♦) are indicated.

   1 ATGAATAAGAAAAACTTAGCAATGGCTATGGCAGCAGTTACTGTTGTGGGTTCTGCAGCG 60
   1 MNKKNLAMAMAAVTVVGSAA 20

  61 CCAATATTTGCAGATAGTACTACGCCAGGTTATACTGTAGTGAAAAATGATTGGAAAAAA 120
  21 PIFADSTTPGYTVVKNDWKK 40
     □

 121 GCAGTAAAACAATTACAAGATGGGTTGAAAAATAAAACTATATCAACAATAAAGGTGTCT 180
  41 AVKQLQDGLKNKTISTIKVS 60

 181 TTTAATGGAAACTCTGTTGGAGAAGTTACACCAGCCAGTTCTGGAGCAAAAAAAGCAGAT 240
  61 FNGNSVGEVTPASSGAKKAD 80

 241 AGAGATGCTGCAGCTGAAAAGTTATATAATTTAGTAAATACACAATTAGATAAACTAGGT 300
  81 RDAAAEKLYNLVNTQLDKLG 100

 301 GATGGAGATTACGTTGACTTTGAAGTAACTTATAATTTAGCTACTCAAATAATTACAAAA 360
 101 DGDYVDFEVTYNLATQIITK 120

 361 GCAGAAGCAGAGGCAGTTCTTACAAAATTACAACAATATAATGATAAAGTACTTATAAAT 420
 121 AEAEAVLTKLQQYNDKVLIN 140

 421 TCTGCAACAGATACAGTAAAAGGTATGGTATCTGATACACAAGTTGATAGCAAAAATGTT 480
 141 SATDTVKGMVSDTQVDSKNV 160

 481 GCAGCTAACCCACTTAAAGTTAGTGATATGTATACAATACCATCTGCTATTACTGGAAGT 540
 161 AANPLKVSDMYTIPSAITGS 180

 541 GATGATTCTGGGTATAGTATTGCTAAACCAACAGAAAAGACTACAaGTTTATTGTATGGT 600
 181 DDSGYSIAKPTEKTTSLLYG 200

 601 ACGGTTGGTGATGCAACTGCAGGTAAAGCAATAACAGTAGATACAGCTTCAAATGAAGCT 660
 201 TVGDATAGKAITVDTASNEA 220

 661 TTTGCTGGAAATGGAAAGGTTATTGACTACAATAAATCATTCAAAGCAACTGTACAAGGA 720
 221 FAGNGKVIDYNKSFKATVQG 240

 721 GATGGAACAGTTAAGACAAGCGGGGTTGTACTTAAAGATGCAAGTGATATGGCTGCAACA 780
 241 DGTVKTSGVVLKDASDMAAT 260

 781 GGTACTATAAAAGTTAGAGTTACAAGTGCAAAAGAAGAATCTATTGATGTGGATTCAAGT 840
 261 GTIKVRVTSAKEESIDVDSS 280

 841 TCATATATTAGTGCTGAAAATTTAGCTAAAAAATATGTATTTAATCCTAAAGAGGTTTCT 900
 281 SYISAENLAKKYVFNPKEVS 300

 901 GAAGCTTATAATGCAATAGTTGCATTACAAAATGATGGAATAGAATCTGATTTAGTACAA 960
 301 EAYNAIVALQNDGIESDLVQ 320

 961 TTAGTTAATGGAAAATATCAAGTTATTTTCTATCCAGAAGGAAAAAGATTAGAAACTAAA 1020
 321 LVNGKYQVIFYPEGKRLETK 340
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1021 TCTGCAGATATAATAGCTGATGCAGATAGTCCAGCTAAAATAACTATAAAAGCTAATAAA 
1080 

^ + ^ ^ ^ 

341 SADIIADADSPAKITIKANK 

360 

♦ 

1081 

TTAAAAGATTTAAAAGATTATGTAGATGATTTAAAAACATACAATAATACTTACTCAAAT 1140 
+ ^ ^ + + 

361 LKDLKDYVDDLKTYNNTYSN 

380 

1141 

GTTGTAACAGTAGCAGGAGAAGATAGAATAGAAACTGCTATAGAATTAAGTAGTAAATAT 1200 
+ + + + + 

381 VVTVAGEDRIETAIELSSKY 

400 

1201 

TATAATTCTGATGATAAAAATGCAATAACTGATGATGCAGTTAATAATATAGTATTAGTT 1260 
+ + + + + 

401 YNSDDKNAITDDAVNNIVLV 

420 

1261 

GGATCTACATCTATAGTTGATGGTCTTGTTGCATCACCATTAGCTTCAGAAAAAACAGCT 1320
+ + 4. + + 

421 GSTSIVDGLVASPLASEKTA 

440 

1321 

CCATTATTATTAACTTCAAAAGATAAATTAGATTCATCAGTAAAATCTGAGATAAAAAGA 1380
+ + + + + 

441 PLLLTSKDKLDSSVKSEIKR

460 

1381 

GTTATGAACTTAAAGAGTGATACTGGTATAAATACTTCTAAAAAAGTTTATTTAGCTGGT 1440 
+ + + + + 

461VMNLKSDTGINTSKKVYLAG 

480 

1441 

GGAGTTAATTCTATATCTAAAGATGTAGAAGATGAATTGAAAAATATGGGCCTTAAAGTT 1500
+ ^ + + + 

481 GVNSISKDVEDELKNMGLKV 

500 

1501 

ACTAGATTATCAGGAGAAGACAGATACGAAACTTCTTTAGCAATAGCTGATGAAATAGGT 1560 
+ + + ^ + 

501 TRLSGEDRYETSLAIADEIG 

520 

1561 

CTTGATAATGATAAAGCATTTGTAGTTGGTGGTACTGGATTGGCAGATGCTATGAGTATA 1620 
+ ^ ^ ^ ^ 

521 LDNDKAFVVGGTGLADAMSI

540 



1621 

GCTCCAGTTGCTTCTCAACTTAAAGATGGAGATGCTACTCCAATAGTAGTTGTAGATGGA 1680
+ 4. + 4. + 
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541APVASQLKDGDATPIVVVDG 

560 

1681 

AAAGCAAAAGAAATAAGTGATGATGCTAAGAGTTTCTTAGGAACTTCTGATGTTGATATA 1740 

+ + + + + 

561KAKEISDDAKSFLGTSDVDI 

580 

1741 

ATAGGTGGAAAAAATAGCGTATCTAAAGAGATTGAAGAGTCAATAGATAGTGCAACTGGA 1800 
+ + + + + 

581 IGGKNSVSKEIEESIDSATG 

600 

1801 

AAAACTCCAGATAGAATAAGTGGAGATGACAGACAAGCAACTAATGCTGAAGTTTTAAAA 1860 
+ + + + + 

601 KTPDRISGDDRQATNAEVLK 

620 

1861 

GAAGATGATTATTTCAAAGATGGTGAAGTTGTGAATTACTTTGTTGCAAAAGATGGTTCT 1920 

+ + + + + 

621 EDDYFKDGEVVNYFVAKDGS 

640 

1921 

ACTAAAGAAGATCAATTAGTAGATGCATTAGCAGCAGCACCAATAGCAGGTAGATTTAAG 1980 
+ + + + + 

641 TKEDQLVDALAAAPIAGRFK 

660 

1981 

GAGTCTCCAGCTCCAATCATACTAGCTACTGATACTTTATCTTCTGACCAAAATGTAGCT 2040 

^ + + + + 

661 ESPAPIILATDTLSSDQNVA 

680 

2041 

GTAAGTAAAGCAGTTCCTAAAGATGGTGGAACTAACTTAGTTCAAGTAGGTAAAGGTATA 2100 

^ ^ + + + 

681 VSKAVPKDGGTNLVQVGKGI 

700 

2101 GCTTCTTCAGTTATAAACAAAATGAAAGATTTATTAGATATGTAA 2145 
+ ^ + + 

701 ASSVINKMKDLLDM* 715 
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Appendix 6 

SEQ ID No 8. Nucleotide sequence of slpA from Clostridium difficile
strain 173644, PCR type 31, with translation. The putative
secretory signal cleavage site (□) and site of cleavage to form the
two mature SLPs (♦) are indicated.
1 

ATGAATAAGAAGGATATAGCAATAGCTATGTCAGGATTAACAGTATTAGCTTCTGCAGCA 60

^ + + + + 

1 MNKKDIAIAMSGLTVLASAA

20 

61 

CCTGTATTTGCTGCTAGTAGTTTTACAGCAGATTATAATTATACTGTAGTGCAAGGAAAA 120

^ + + + + 

21PVFAASSFTADYNYTVVQGK 

40 

□

121 

TATCAAAAAGTTATAACTGGATTACAAGATGGTTTAAAAAATGGAAAAATAACAAATATT 180 
+ + + + + 

41YQKVITGLQDGLKNGKITNI 

60 

181 

GATGTAATATTTGATGGAAGTTCAATTGGTGAGGTAGTGCCAGGTTCTGATGCTGCAGCT 240 

^ + 4. + + 

61DVIFDGSSIGEVVPGSDAAA 

80 

241 

GCAGCTACTAAATTAAAAAGTTTAGTTGATGATAAGTTAGATAACTTAGGTGATGGAAAA 300

+ + + + 4. 

81AATKLKSLVDDKLDNLGDGK 

100 

301 

TACGTTCAATTTAATGTTACTTATACTACTAAATCTATAATAACTAAAGCAGAATTAAAA 360

^ + + + + 

101 YVQFNVTYTTKSIITKAELK 

120 

361 

AATTATTATAATCAATTAGAAAGTAGTAAAGATAGAATACTTATAGGAAATGAACCTCAA 420 

+ + + + + 

121 NYYNQLESSKDRILIGNEPQ

140 

421 

GATACAGGAACTAAAGGTCTTATAAAAGCTGATACTGATGGTACTACTGCTGTTGCAGCA 480 
+ + + + + 

141 DTGTKGLIKADTDGTTAVAA 

160 



481 

GCTGCACCATTGAAATTATCAGATATATTTACGTTTAGTTATGATGAAGTAACAGGTGTA 540

^ 4. + 4. + 

161 AAPLKLSDIFTFSYDEVTGV 

180 
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541 

CTTAAAGCAGAACCAACAAGTAAAGTAAGCGCTGGTAAAGTTCAAGGTCTAAAATATGGA 600 
^ + ^ 4. ^ 

181 LKAEPTSKVSAGKVQGLKYG 

200 

601 

AATACAGGAGCAACTAACTATACTTCTGGAGCTGAAATATCTGTTCCTACTACAGGCTTA 660
+ ^ ^ ^ ^ 

201 NTGATNYTSGAEISVPTTGL 

220 

661 

ACATTAACTGCTGATACAACTGCAACAACAGATGTAAATATTTCTGATGTTATGAGTGCA 720
+ ^ 4. + + 

221 TLTADTTATTDVNISDVMSA 

240 

721 

TTTAAATTTAATGGTACTGATACGATTAGTGGATTCCCAGCTGGTTCATCAGCTTCTACT 780

+ + ^ + ^ 

241 FKFNGTDTISGFPAGSSAST

260 

781 

CTTAGAGCAAGTATAAAAGTAATAAATGCAAAAGAAGAATCTATAGATGTTGATTCAAGT 840 

+ ^ + 4. 4. 

261 LRASIKVINAKEESIDVDSS 

280 

841 

TCACATAGAACAGCTGAAGATTTAGCTGAAAAATATGTATTTAAACCAGAAGATGTGAAT 900 

4. + + + + 

281 SHRTAEDLAEKYVFKPEDVN 

300 

901 

AAAACTTATGAGGCACTGACTGATTTATATAAAGAAGGTATAACAAGTAATCTTATCACT 960 

+ 4. + 4. + 

301 KTYEALTDLYKEGITSNLIT 

320 

961 

CAAGATGGTGGAAAATATCAAGTTGTTTTATTTGCTCAAGGAAAGAGATTAACTACTAAA 1020
4. 4. + + + 

321 QDGGKYQVVLFAQGKRLTTK 

340 

1021 

GGAGCAACTGGAACTTTAGCAGATGAAAATTCTCCTCTTAAAGTAACAATAAAAGCAGAT 1080 
4. 4. ^ 4. 4. 

341 GATGTLADENSPLKVTIKAD 

360 

♦ 

1081 

AAAGTAAAAGACTTAAAAGATTATGTTGAAGATTTAAAAAATGCTAACAATGGATATTCA 1140
+ + 4. + + 

361 KVKDLKDYVEDLKNANNGYS 

380 

1141 

AATTCTGTTGTTGTAGCAGGTGAAGATAGAATAGAAACAGCAATAGAGTTAAGTAGCAAA 1200
4. 4. + 4. 4, 
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381 NSVVVAGEDRIETAIELSSK 

400 

1201 

TACTATAACTCTGATGATGACAATGCAATAACTAAAGATCCAGTTAACAATGTTGTTTTA 1260 
^ ^ + + + 

401 YYNSDDDNAITKDPVNNVVL 

420 

1261 

GTTGGTTCTCAAGCTGTAGTTGATGGGCTTGTAGCTTCACCTTTAGCATCTGAAAAAAGA 1320
+ + + + + 

421 VGSQAVVDGLVASPLASEKR 

440 

1321 

GCTCCTTTACTATTAACTTCAGCAGGAAAATTAGATTCAAGTGTTAAAGCTGAGTTGAAA 1380
+ + + + + 

441APLLLTSAGKLDSSVKAELK 

460 

1381 

AGAGTAATGGATTTAAAATCTACAACAGGTGTAAATACTTCTAAAAAAGTTTACTTAGCT 1440 
+ + + + 4. 

461 RVMDLKSTTGVNTSKKVYLA 

480 

1441 

GGTGGAGTAAACTCTATATCTAAAGATGTAGAAAATGAATTAAAAGATATGGGACTTAAA 1500 
^ + + + 4. 

481 GGVNSISKDVENELKDMGLK 

500 

1501 

GTTACAAGATTATCAGGAGATGATAGATATGAAACTTCTTTAGCTATAGCTGATGAAATA 1560 

+ 4. + + + 

501VTRLSGDDRYETSLAIADEI 

520 

1561 

GGTCTTGATAATGATAAAGCTTTTGTAGTTGGAGGAACAGGATTAGCGGATGCTATGAGT 1620 
+ ^ 4. + + 

521 GLDNDKAFVVGGTGLADAMS 

540 

1621 

ATAGCTCCAGTTGCTTCTCAATTAAGAAACTCAAATGGAGAACTTGACTTAAAAGGTGAT 1680
^ + + + 

541 IAPVASQLRNSNGELDLKGD

560 



1681 

GCAACTCCAATAGTAGTTGTTGATGGAAAAGCTAAAGATATAAATTCTGAAGTAAAAGAT 1740 
4. + + + + 

561 ATPIVVVDGKAKDINSEVKD 

580 

1741 

TTCTTAGATGATTCACAAGTTGATATAATAGGTGGTGTAAATAGTGTTTCTAAAGAAGTA 1800 
+ + + + + 

581 FLDDSQVDIIGGVNSVSKEV 

600 

1801 

ATGGAAGCAATAGATGATGCTACTGGAAAATCACCTGAGAGATATAGTGGAGAAGATAGA 1860 






+ + + + + 

601 MEAIDDATGKSPERYSGEDR 

620 

1861 

CAAGCAACAAATGCTAAAGTTATAAAAGAAGATGATTTCTTTAAAAATGGAGAAGTTACA 1920

+ 4. ^ + + 

621 QATNAKVIKEDDFFKNGEVT 

640 

1921 

AACTTCTTTGTAGCTAAAGATGGTTCAACTAAAGAAGATCAATTAGTAGATGCTTTAGCA 1980 

+ ^ + + + 

641 NFFVAKDGSTKEDQLVDALA 

660 

1981 

GGTGCTGCAATTGCTGGTAACTTTGGTGTAACAGTAGATAATGAAGGAAAACCTACAGTT 2040

+ + + + + 

661 GAAIAGNFGVTVDNEGKPTV 

680 

2041 

GCTGATAAAAAAGCTTCTCCAGCACCAATTGTTTTAGCAACAGATTCTTTATCTTCTGAT 2100 

+ + + 4. + 

681ADKKASPAPIVLATDSLSSD 

700 

2101 

CAAAATGTAGCTATAAGTAAAGCTGTAAATGATGACGCTAATACTAAGAATCTAGTTCAA 2160 

+ + + + + 

701QNVAISKAVNDDANTKNLVQ 

720 

2161 GTTGGTAAAGGTATAGCTACTTCAGTTGTAAGTAAAATAAAAGATTTATTAGATATG 
2217 

+ ^ + + + 

721 VGKGIATSVVSKIKDLLDM 

739 
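
The translation printed beneath each 60-nucleotide row can be checked by reading the row in frame using the standard genetic code. The following is a minimal illustrative sketch only and forms no part of the sequence listing; the names `CODON_TABLE` and `translate` are assumptions introduced here for illustration. It reproduces residues 41 to 60 of SEQ ID No 8 from nucleotides 121 to 180.

```python
# Illustrative check only (not part of the listing): standard genetic code,
# with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def translate(nucleotides: str) -> str:
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    seq = nucleotides.upper().replace(" ", "")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Nucleotides 121 to 180 of SEQ ID No 8, as printed in the listing.
row = "TATCAAAAAGTTATAACTGGATTACAAGATGGTTTAAAAAATGGAAAAATAACAAATATT"
print(translate(row))  # YQKVITGLQDGLKNGKITNI
```

A row whose computed translation differs from the printed residues points to a transcription error in either the nucleotide line or the amino acid line.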
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Appendix 7 

SEQ ID No 9. Nucleotide sequence of slpA from Clostridium difficile
strain 170444, PCR type 46, with translation. The putative
secretory signal cleavage site (□) and site of cleavage to form the
two mature SLPs (♦) are indicated.

1 

ATGAATAAGAAAAATATAGCAATAGCTATGTCAGGTTTAACAGTTTTAGCTTCGGCTGCT 60 

+ + + + + 

1 MNKKNIAIAMSGLTVLASAA

20 

61 

CCTGTTTTTGCTGCAACTACTGGAACACAAGGTTATACTGTAGTTAAAAACGACTGGAAA 120 

+ + + + + 

21PVFAATTGTQGYTVVKNDWK 

40 

□ 

121 

AAAGCAGTAAAACAATTACAAGATGGACTAAAAGATAATAGTATAGGAAAGATAACTGTA 180

+ + + + + 

41KAVKQLQDGLKDNSIGKITV 

60 

181 

TCTTTTAATGATGGGGTTGTGGGTGAAGTAGCTCCTAAAAGTGCTAATAAGAAAGCGGAC 240 

+ + + + + 

61SFNDGVVGEVAPKSANKKAD 

80 

241 

AGAGATGCTGCAGCTGAGAAGTTATATAATCTTGTTAACACTCAATTAGATAAATTAGGT 300

+ + + + + 

81RDAAAEKLYNLVNTQLDKLG 

100 

301 

GATGGAGATTATGTTGATTTTTCTGTAGATTATAATTTAGAAAAAAAAATAATAACTAAT 360 

+ + + + + 

101 DGDYVDFSVDYNLEKKIITN

120 

361 

CAAGCAGATGCAGAAGCAATTGTTACAAAGTTAAATTCACTTAATGAGAAAACTCTTATT 420 

^ + + + + 

121 QADAEAIVTKLNSLNEKTLI 

140 

421 

GATATAGCAACTAAAGATACTTTTGGAATGGTTAGTAAAACACAAGATAGTGAAGGTAAA 480

+ + + + + 

141 DIATKDTFGMVSKTQDSEGK 

160 
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481 

AATGTTGCTGCAACAAAGGCACTTAAAGTTAAAGATGTTGCTACATTTGGTTTGAAGTCT 540 
+ + ^ + + 

161 NVAATKALKVKDVATFGLKS 

180 

541 

GGTGGAAGCGAAGATACTGGATATGTTATTGAAATGAAAGCAGGAGCTGTAGAGGATAAG 600 

_,. ^ ^ ^ ^ 

181GGSEDTGYVIEMKAGAVEDK 

200 

601 

TATGGTAAAGTTGGAGATAGTACGGCAGGTATTGCAATAAATCTTCCTAGTACTGGACTT 660 

^ + ^ ^ + 

201 YGKVGDSTAGIAINLPSTGL 

220 

661 

GAATATGCAGGTAAAGGAACAACAATTGATTTTAATAAAACTTTAAAAGTTGATGTAACA 720

+ ^ ^ ^ ^ 

221 EYAGKGTTIDFNKTLKVDVT 

240 

721 

GGTGGTTCAACACCTAGTGCTGTAGCTGTAAGTGGTTTTGTAACTAAAGATGATACTGAT 780
+ + + + + 

241GGSTPSAVAVSGFVTKDDTD 

260 

781 

TTAGCAAAATCAGGTACTATAAATGTAAGAGTTATAAATGCAAAAGAAGAATCAATTGAT 840 

+ + + + + 

261 LAKSGTINVRVINAKEESID 

280 

841 

ATAGATGCAAGCTCATATACATCAGCTGAAAATTTAGCTAAAAGACATGTATTTGATCCA 900 

+ + ^ + + 

281 IDASSYTSAENLAKRHVFDP 

300 

901 

GATGAAATTTCTGAAGCATATAAGGCAATAGTAGCATTACAAAATGATGGTATAGAGTCT 960 

+ ^ + + ^ 

301 DEISEAYKAIVALQNDGIES 

320 

961 

AATTTAGTTCAGTTAGTTAATGGAAAATATCAAGTGATTTTTTATCCAGAAGGTAAAAGA 1020
+ ^ ^ + + 

321 NLVQLVNGKYQVIFYPEGKR 

340 
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1021 TTAGAAACTAAATCAGCAAATGATACAATAGCTAGTCAAGATACACCAGCTAAAGTAGTT 
1080 

^ ^ ^ + ^ 

341 LETKSANDTIASQDTPAKVV 

360 

♦ 

1081 

ATAAAAGCTAATAAATTAAAAGATTTAAAAGATTATGTAGATGATTTAAAAACATATAAT 1140

+ ^ + ^ ^ 

361 IKANKLKDLKDYVDDLKTYN

380 

1141 

AATACTTATTCAAATGTTGTAACAGTAGCAGGAGAAGATAGAATAGAAACTGCTATAGAA 1200
+ + + + 4. 

381NTYSNVVTVAGEDRIETAIE 

400 

1201 

TTAAGTAGTAAATATTATAATTCTGATGATAAAAATGCAATAACTGATAAAGCAGTTAAT 1260 
+ + + + + 

401 LSSKYYNSDDKNAITDKAVN

420 

1261 

GATATAGTATTAGTTGGATCTACATCTATAGTTGATGGTCTTGTTGCATCACCATTAGCT 1320 
+ + ^ ^ + 

421 DIVLVGSTSIVDGLVASPLA 

440 

1321 

TCAGAAAAAACAGCTCCATTATTATTAACTTCAAAAGATAAATTAGATTCATCAGTAAAA 1380 
+ + + + + 

441 SEKTAPLLLTSKDKLDSSVK 

460 

1381 

TCTGAAATAAAGAGAGTTATGAACTTAAAGAGTGACACTGGTATAAATACTTCTAAAAAA 1440 
^ ^ ^ + ^ 

461 SEIKRVMNLKSDTGINTSKK 

480 

1441 

GTTTATTTAGCTGGTGGAGTTAATTCTATATCTAAAGATGTAGAAAATGAATTGAAAAAC 1500 
+ + + + + 

481 VYLAGGVNSISKDVENELKN 

500 

1501 

ATGGGTCTTAAAGTTACTAGATTATCAGGAGAAGACAGATACGAAACTTCTTTAGCAATA 1560 
+ ^ + ^ + 

501 MGLKVTRLSGEDRYETSLAI 

520 

1561 

GCTGATGAAATAGGTCTTGATAATGATAAAGCATTTGTAGTTGGTGGTACTGGATTAGCA 1620 
4. 4. 4. 4. 4, 

521 ADEIGLDNDKAFVVGGTGLA 

540 



1621 

GATGCTATGAGTATAGCTCCAGTTGCTTCTCAACTTAAAGATGGAGATGCTACTCCAATA 1680 
+ + 4. 4. 4. 
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541 DAMSIAPVASQLKDGDATPI 

560 

1681 

GTAGTTGTAGATGGAAAAGCAAAAGAAATAAGTGATGATGCTAAGAGTTTCTTAGGAACT 1740 
+ + + + + 

561 VVVDGKAKEISDDAKSFLGT 

580 

1741 

TCTGATGTTGATATAATAGGTGGAAAAAATAGCGTATCTAAAGAGATTGAAGAGTCAATA 1800 
+ + 4. 4. + 

581 SDVDIIGGKNSVSKEIEESI 

600 

1801 

GATAGTGCAACTGGAAAAACTCCAGATAGAATAAGTGGAGATGATAGACAAGCAACTAAT 1860 
+ + + + + 

601DSATGKTPDRISGDDRQATN 

620 

1861 

GCTGAAGTTTTAAAAGAAGATGATTATTTCACAGATGGTGAAGTTGTGAATTACTTTGTT 1920 
4. 4. + + + 

621AEVLKEDDYFTDGEVVNYFV 

640 

1921 

GCAAAAGATGGTTCTACTAAAGAAGATCAATTAGTAGATGCCTTAGCAGCAGCACCAATA 1980 

+ + 4. + + 

641 AKDGSTKEDQLVDALAAAPI 

660 

1981 

GCAGGTAGATTTAAGGAGTCTCCAGCTCCAATCATACTAGCTACTGATACTTTATCTTCT 2040

4. ^ + + + 

661 AGRFKESPAPIILATDTLSS 

680 

2041 

GACCAAAATGTAGCTGTAAGTAAAGCAGTTCCTAAAGATGGTGGAACTAACTTAGTTCAA 2100 

+ 4. + 4. + 

681 DQNVAVSKAVPKDGGTNLVQ 

700 

2101 GTAGGTAAAGGTATAGCTTCTTCAGTTATAAACAAAATGAAAGATTTATTAGATATG 
2157 

4. 4. 4, 4. + 

701 VGKGIASSVINKMKDLLDM 

719 
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Appendix 8 

SEQ ID No 10. Nucleotide sequence of slpA from Clostridium difficile
strain 170426, PCR type 92, with translation. The putative
secretory signal cleavage site (□) and site of cleavage to form the
two mature SLPs (♦) are indicated.
1 

ATGAATAAGAAAAATATAGCAATAGCTATGTCAGGTTTAACAGTTTTAGCTTCGGCTGCT 60

+ + + + +

1 MNKKNIAIAMSGLTVLASAA

20 

61 

CCTGTTTTTGCTGCAACTACTGGAACACAAGGTTATACTGTAGTTAAAAACGACTGGAAA 120

+ + + + +

21PVFAATTGTQGYTVVKNDWK 

40

□

121

AAAGCAGTAAAACAATTACAGGATGGACTAAAAGATAATAGTATAGGAAAGATAACTGTA 180

+ + + + +

41 KAVKQLQDGLKDNSIGKITV

60

181 

TCTTTTAATGATGGGGTTGTGGGTGAAGTAGCTCCTAAAAGTGCTAATAAGAAAGCGGAC 240

+ + + + +

61 SFNDGVVGEVAPKSANKKAD

80

241

AGAGATGCTGCAGCTGAGAAGTTATATAATCTTGTTAACACTCAATTAGATAAATTAGGT 300

+ + + + +

81 RDAAAEKLYNLVNTQLDKLG

100 

301 

GATGGAGATTATGTTGATTTTTCTGTAGATTATAATTTAGAAAAAAAAATAATAACTAAT 360

^ 4. ^ + + 

101 DGDYVDFSVDYNLEKKIITN 

120 

361 

CAAGCAGATGCAGAAGCAATTGTTACAAAGTTAAATTCACTTAATGAGAAAACTCTTATT 420

^ + + + 4. 

121 QADAEAIVTKLNSLNEKTLI

140 

421 

GATATAGCAACTAAAGATACTTTTGGAATGGTTAGTAAAACACAAGATAGTGAAGGTAAA 480

+ + + + + 


141 DIATKDTFGMVSKTQDSEGK 

160 


481 

AATGTTGCTGCAACAAAGGCACTTAAAGTTAAAGATGTTGCTACATTTGGTTTGAAGTCT 540

^ 4. 4. + + 

161 NVAATKALKVKDVATFGLKS

180 
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541 

GGTGGAAGCGAAGATACTGGATATGTTGTTGAAATGAAAGCAGGAGCTGTAGAGGATAAG 600 

^ ^ 4. ^ ^ 

181 GGSEDTGYVVEMKAGAVEDK 

200 

601 

TATGGTAAAGTTGGAGATAGTACGGCAGGTATTGCAATAAATCTTCCTAGTACTGGACTT 660 

+ ^ + + + 

201 YGKVGDSTAGIAINLPSTGL 

220 

661 

GAATATGCAGGTAAAGGAACAACAATTGATTTTAATAAAACTTTAAAAGTTGATGTAACA 720 

+ + ^ ^ + 

221 EYAGKGTTIDFNKTLKVDVT 

240 

721 

GGTGGTTCAACACCTAGTGCTGTAGCTGTAAGTGGTTTTGTAACTAAAGATGATACTGAT 780

+ ^ + ^ + 

241 GGSTPSAVAVSGFVTKDDTD 

260 

781 

TTAGCAAAATCAGGTACTATAAATGTAAGAGTTATAAATGCAAAAGAAGAATCAATTGAT 840 

+ ^ + + ^ 

261 LAKSGTINVRVINAKEESID

280 

841 

ATAGATGCAAGCTCATATACATCAGCTGAAAATTTAGCTAAAAGATATGTATTTGATCCA 900 

4. 4. + + + 

281 IDASSYTSAENLAKRYVFDP 

300 

901 

GATGAAATTTCTGAAGCATATAAGGCAATAGTAGCATTACAAAATGATGGTATAGAGTCT 960 

+ + + + + 

301 DEISEAYKAIVALQNDGIES 

320 

961 

AATTTAGTTCAGTTAGTTAATGGAAAATATCAAGTGATTTTTTATCCAGAAGGTAAAAGA 1020 
+ 4. + + + 

321 NLVQLVNGKYQVIFYPEGKR 

340 

1021 

TTAGAAACTAAATCAGCAAATGATACAATAGCTAGTCAAGATACACCAGCTAAAGTAGTT 1080 
341 LETKSANDTIASQDTPAKVV 

360 

♦ 

1081 

ATAAAAGCTAATAAATTAAAAGATTTAAAAGATTATGTAGATGATTTAAAAACATATAAT 1140 
+ + + ^ + 

361 IKANKLKDLKDYVDDLKTYN 

380 

1141 

AATACTTATTCAAATGTTGTAACAGTAGCAGGAGAAGATAGAATAGAAACTGCTATAGAA 1200 
+ + ^ + + 
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381NTYSNVVTVAGEDRIETAIE 

400 

1201 

TTAAGTAGTAAATATTATAATTCTGATGATAAAAATGCAATAACTGATAAAGCAGTTAAT 1260 
^ + + + + 

401 LSSKYYNSDDKNAITDKAVN 

420 

1261 

GATATAGTATTAGTTGGATCTACATCTATAGTTGATGGTCTTGTTGCATCACCATTAGCT 1320
4. 4. 4. 4. 4. 

421 DIVLVGSTSIVDGLVASPLA 

440 

1321 

TCAGAAAAAACAGCTCCATTATTATTAACTTCAAAAGATAAATTAGATTCATCAGTAAAA 1380
4. 4. 4. 4. 4. 

441 SEKTAPLLLTSKDKLDSSVK

460 

1381 

TCTGAAATAAAGAGAGTTATGAACTTAAAGAGTGACACTGGTATAAATACTTCTAAAAAA 1440 
4. 4. 4. + 4, 

461 SEIKRVMNLKSDTGINTSKK 

480 

1441 

GTTTATTTAGCTGGTGGAGTTAATTCTATATCTAAAGATGTAGAAAATGAATTGAAAAAC 1500 
4. 4. 4. 4. 4. 

481 VYLAGGVNSISKDVENELKN

500 

1501 

ATGGGTCTTAAAGTTACTAGATTATCAGGAGAAGACAGATACGAAACTTCTTTAGCAATA 1560 
4. 4. 4. 4. 4. 

501 MGLKVTRLSGEDRYETSLAI 

520 

1561 

GCTGATGAAATAGGTCTTGATAATGATAAAGCATTTGTAGTTGGTGGTACTGGATTAGCA 1620 
4. 4. 4. 4. 4. 

521ADEIGLDNDKAFVVGGTGLA 

540 

1621 

GATGCTATGAGTATAGCTCCAGTTGCTTCTCAACTTAAAGATGGAGATGCTACTCCAATA 1680 
4. 4. 4. 4. 4. 

541 DAMSIAPVASQLKDGDATPI 

560 



1681 

GTAGTTGTAGATGGAAAAGCAAAAGAAATAAGTGATGATGCTAAGAGTTTCTTAGGAACT 1740
4. 4. 4. 4. 4. 

561 VVVDGKAKEISDDAKSFLGT 

580 

1741 

TCTGATGTTGATATAATAGGTGGAAAAAATAGCGTATCTAAAGAGATTGAAGAGTCAATA 1800 
4. 4. 4. 4. 4. 

581 SDVDIIGGKNSVSKEIEESI 

600 

1801 

GATAGTGCAACTGGAAAAACTCCAGATAGAATAAGTGGAGATGATAGACAAGCAACTAAT 1860 
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+ + + + + 

601DSATGKTPDRISGDDRQATN 

620 

1861 

GCTGAAGTTTTAAAAGAAGATGATTATTTCACAGATGGTGAAGTTGTGAATTACTTTGTT 1920 
^ ^ 4. + + 

621 AEVLKEDDYFTDGEVVNYFV 

640 

1921 

GCAAAAGATGGTTCTACTAAAGAAGATCAATTAGTAGATGCCTTAGCAGCAGCACCAATA 1980 
+ ^ + + + 

641 AKDGSTKEDQLVDALAAAPI

660 

1981 

GCAGGTAGATTTAAGGAGTCTCCAGCTCCAATCATACTAGCTACTGATACTTTATCTTCT 2040
+ + + + + 

661 AGRFKESPAPIILATDTLSS 

680 

2041 

GACCAAAATGTAGCTGTAAGTAAAGCAGTTCCTAAAGATGGTGGAACTAACTTAGTTCAA 2100 
+ + + + + 

681 DQNVAVSKAVPKDGGTNLVQ 

700 

2101 GTAGGTAAAGGTATAGCTTCTTCAGTTATAAACAAAATGAAAGATTTATTAGATATG 
2157 

4. + + + + 

701 VGKGIASSVINKMKDLLDM 

719 



